Search | arXiv e-print repository

The Economic Effect of Gaining a New Qualification Later in Life

Authors: Finn Lattimore, Daniel M. Steinberg, Anna Zhu

Abstract: Pursuing educational qualifications later in life is an increasingly common phenomenon within OECD countries since technological change and automation continue to drive the evolution of skills needed in many professions. We focus on the causal impact on economic returns of degrees completed later in life, where motivations and capabilities to acquire additional education may be distinct from education in early years. We find that completing an additional degree leads to more than \$3000 (AUD, 2019) extra income per year compared to those who do not complete additional study. For outcomes, treatment and controls we use the extremely rich and nationally representative longitudinal data from the Household, Income and Labour Dynamics in Australia (HILDA) survey. To take full advantage of the complexity and richness of this data we use a Machine Learning (ML) based methodology for causal effect estimation. We are also able to use ML to discover sources of heterogeneity in the effects of gaining additional qualifications. For example, those younger than 45 years of age when obtaining additional qualifications tend to reap more benefits (as much as \$50 per week more) than others.

Submitted 21 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

Comments: 80 pages, 16 figures

arXiv:2205.05624 [pdf, other]

Leveraging baseline covariates to analyze small cluster-randomized trials with a rare binary outcome

Authors: Angela Y. Zhu, Nandita Mitra, Karla Hemming, Michael O. Harhay, Fan Li

Abstract: Cluster-randomized trials (CRTs) involve randomizing entire groups of participants -- called clusters -- to treatment arms but are often comprised of a limited or fixed number of available clusters. While covariate adjustment can account for chance imbalances between treatment arms and increase statistical efficiency in individually-randomized trials, analytical methods for individual-level covariate adjustment in small CRTs have received little attention to date. In this paper, we systematically investigate, through extensive simulations, the operating characteristics of propensity score weighting and multivariable regression as two individual-level covariate adjustment strategies for estimating the participant-average causal effect in small CRTs with a rare binary outcome and identify scenarios where each adjustment strategy has a relative efficiency advantage over the other to make practical recommendations. We also examine the finite-sample performance of the bias-corrected sandwich variance estimators associated with propensity score weighting and multivariable regression for quantifying the uncertainty in estimating the participant-average treatment effect. To illustrate the methods for individual-level covariate adjustment, we reanalyze a recent CRT testing a sedation protocol in 31 pediatric intensive care units.

Submitted 28 November, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

arXiv:2203.14702 [pdf, other]

Bi-level Doubly Variational Learning for Energy-based Latent Variable Models

Authors: Ge Kan, Jinhu Lü, Tian Wang, Baochang Zhang, Aichun Zhu, Lei Huang, Guodong Guo, Hichem Snoussi

Abstract: Energy-based latent variable models (EBLVMs) are more expressive than conventional energy-based models. However, their potential on visual tasks is limited by a training process based on maximum likelihood estimation, which requires sampling from two intractable distributions. In this paper, we propose Bi-level doubly variational learning (BiDVL), which is based on a new bi-level optimization framework and two tractable variational distributions to facilitate learning EBLVMs. In particular, we introduce a decoupled EBLVM consisting of a marginal energy-based distribution and a structural posterior to handle the difficulties of learning deep EBLVMs on images. By choosing a symmetric KL divergence in the lower level of our framework, a compact BiDVL for visual tasks can be obtained. Our model achieves impressive image generation performance over related works. It also demonstrates significant capacity for test image reconstruction and out-of-distribution detection.

Submitted 24 March, 2022; originally announced March 2022.

Comments: CVPR 2022

arXiv:2003.03032 [pdf]

Modeling Spontaneous Exit Choices in Intercity Expressway Traffic with Quantum Walk

Authors: Zhaoyuan Yu, Xinxin Zhou, Xu Hu, Wen Luo, Linwang Yuan, A-Xing Zhu

Abstract: In intercity expressway traffic, a driver frequently makes decisions to adjust driving behavior according to time, location and traffic conditions, which further affects when and where the driver will leave the expressway. Spontaneous exit choices by drivers are hard to observe, and thus it is a challenge to model intercity expressway traffic sufficiently. In this paper, we develop a Spontaneous Quantum Traffic Model (SQTM), which models the stochastic traffic fluctuation caused by spontaneous exit choices and the residual regularity fluctuation with a Quantum Walk and an Autoregressive Moving Average (ARMA) model, respectively. SQTM considers the spontaneous exit choice of a driver as a quantum stochastic process with a dynamical probability function that varies according to time, location and traffic conditions. A quantum walk is applied to update the probability function, which simulates when and where a driver will leave the traffic affected by spontaneous exit choices. We validate our model with hourly traffic data from 7 exits on the Nanjing-Changzhou expressway in Eastern China. For the 7 exits, the coefficients of determination of SQTM ranged from 0.5 to 0.85. Compared with the classical random walk and ARMA models, the coefficients of determination increased by 21.28% to 104.98%, and the relative mean square error decreased by 11.61% to 32.92%. We conclude that SQTM provides new potential for modeling traffic dynamics with consideration of unobservable spontaneous driver decision-making.

Submitted 6 March, 2020; originally announced March 2020.

arXiv:2001.03750 [pdf, other]

SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems

Authors: Pengzhan Jin, Zhen Zhang, Aiqing Zhu, Yifa Tang, George Em Karniadakis

Abstract: We propose new symplectic networks (SympNets) for identifying Hamiltonian systems from data based on a composition of linear, activation and gradient modules. In particular, we define two classes of SympNets: the LA-SympNets composed of linear and activation modules, and the G-SympNets composed of gradient modules. Correspondingly, we prove two new universal approximation theorems that demonstrate that SympNets can approximate arbitrary symplectic maps based on appropriate activation functions. We then perform several experiments including the pendulum, double pendulum and three-body problems to investigate the expressivity and the generalization ability of SympNets. The simulation results show that even very small size SympNets can generalize well, and are able to handle both separable and non-separable Hamiltonian systems with data points resulting from short or long time steps. In all the test cases, SympNets outperform the baseline models, and are much faster in training and prediction. We also develop an extended version of SympNets to learn the dynamics from irregularly sampled data. This extended version of SympNets can be thought of as a universal model representing the solution to an arbitrary Hamiltonian system.

Submitted 19 August, 2020; v1 submitted 11 January, 2020; originally announced January 2020.

arXiv:1907.00693 [pdf, other]

Scene Text Magnifier

Authors: Toshiki Nakamura, Anna Zhu, Seiichi Uchida

Abstract: Scene text magnifier aims to magnify text in natural scene images without recognition. It could help special groups, such as people with myopia or dyslexia, to better understand the scene. In this paper, we design the scene text magnifier through four interacting CNN-based networks: character erasing, character extraction, character magnification, and image synthesis. The architecture of the networks is extended from hourglass encoder-decoders. It takes the original scene text image as input and outputs the text-magnified image while keeping the background unchanged. As intermediate results, we can get the side outputs of text erasing and text extraction. The four sub-networks are first trained independently and then fine-tuned in end-to-end mode. The training samples for each stage are processed through a flow with the original image and text annotation in the ICDAR2013 and Flickr datasets as input, and the corresponding text-erased image, magnified text annotation, and text-magnified scene image as output. To evaluate the performance of the text magnifier, the Structural Similarity is used to measure the regional changes in each character region. The experimental results demonstrate that our method can magnify scene text effectively without affecting the background.

Submitted 5 July, 2019; v1 submitted 16 June, 2019; originally announced July 2019.

Comments: to appear at the International Conference on Document Analysis and Recognition (ICDAR) 2019

arXiv:1710.10641 [pdf]

A Fast, Accurate Two-Step Linear Mixed Model for Genetic Analysis Applied to Repeat MRI Measurements

Authors: Qifan Yang, Gennady V. Roshchupkin, Wiro J. Niessen, Sarah E. Medland, Alyssa H. Zhu, Paul M. Thompson, Neda Jahanshad

Abstract: Large-scale biobanks are being collected around the world in efforts to better understand human health and risk factors for disease. They often survey hundreds of thousands of individuals, combining questionnaires with clinical, genetic, demographic, and imaging assessments; some of this data may be collected longitudinally. Genetic association analysis of such datasets requires methods to properly handle relatedness, population structure and other types of biases introduced by confounders. The most popular and accurate approaches rely on linear mixed model (LMM) algorithms, which are iterative, with the computational complexity of each iteration scaling with the square of the sample size, slowing the pace of discoveries (up to several days for single trait analysis) and, furthermore, limiting the use of repeat phenotypic measurements. Here, we describe our new, non-iterative, much faster and accurate Two-Step Linear Mixed Model (Two-Step LMM) approach, which has a computational complexity that scales linearly with sample size. We show that the first step retains accurate estimates of the heritability (the proportion of the trait variance explained by additive genetic factors), even when increasingly complex genetic relationships between individuals are modeled. The second step provides a faster framework to obtain the effect sizes of covariates in the regression model. We applied Two-Step LMM to real data from the UK Biobank, which recently released genotyping information and processed MRI data from 9,725 individuals. We used the left and right hippocampus volume (HV) as repeated measures, and observed increased and more accurate heritability estimation, consistent with simulations.

Submitted 15 March, 2019; v1 submitted 29 October, 2017; originally announced October 2017.

Comments: 2017 Neural Information Processing Systems (NeurIPS) BigNeuro Workshop

arXiv:1307.2855 [pdf, other]

doi 10.1137/1.9781611973402.94

Flow-Based Algorithms for Local Graph Clustering

Authors: Lorenzo Orecchia, Zeyuan Allen Zhu

Abstract: Given a subset A of vertices of an undirected graph G, the cut-improvement problem asks us to find a subset S that is similar to A but has smaller conductance. A very elegant algorithm for this problem has been given by Andersen and Lang [AL08] and requires solving a small number of single-commodity maximum flow computations over the whole graph G. In this paper, we introduce LocalImprove, the first cut-improvement algorithm that is local, i.e. that runs in time dependent on the size of the input set A rather than on the size of the entire graph. Moreover, LocalImprove achieves this local behaviour while essentially matching the same theoretical guarantee as the global algorithm of Andersen and Lang. The main application of LocalImprove is to the design of better local-graph-partitioning algorithms. All previously known local algorithms for graph partitioning are random-walk based and can only guarantee an output conductance of O(\sqrt{OPT}) when the target set has conductance OPT \in [0,1]. Very recently, Zhu, Lattanzi and Mirrokni [ZLM13] improved this to O(OPT / \sqrt{CONN}) where the internal connectivity parameter CONN \in [0,1] is defined as the reciprocal of the mixing time of the random walk over the graph induced by the target set. In this work, we show how to use LocalImprove to obtain a constant approximation O(OPT) as long as CONN/OPT = Omega(1). This yields the first flow-based algorithm. Moreover, its performance strictly outperforms the ones based on random walks and surprisingly matches that of the best known global algorithm, which is SDP-based, in this parameter regime [MMV12]. Finally, our results show that spectral methods are not the only viable approach to the construction of local graph partitioning algorithms and open the door to the study of algorithms with even better approximation and locality guarantees.

Submitted 13 October, 2013; v1 submitted 10 July, 2013; originally announced July 2013.

Comments: A shorter version of this paper has appeared in the proceedings of the 25th ACM-SIAM Symposium on Discrete Algorithms (SODA) 2014

arXiv:1304.8132 [pdf, other]

Local Graph Clustering Beyond Cheeger's Inequality

Authors: Zeyuan Allen Zhu, Silvio Lattanzi, Vahab Mirrokni

Abstract: Motivated by applications of large-scale graph clustering, we study random-walk-based LOCAL algorithms whose running times depend only on the size of the output cluster, rather than the entire graph. All previously known such algorithms guarantee an output conductance of $\tilde{O}(\sqrt{φ(A)})$ when the target set $A$ has conductance $φ(A)\in[0,1]$. In this paper, we improve it to $$\tilde{O}\bigg( \min\Big\{\sqrt{φ(A)}, \frac{φ(A)}{\sqrt{\mathsf{Conn}(A)}} \Big\} \bigg)\enspace, $$ where the internal connectivity parameter $\mathsf{Conn}(A) \in [0,1]$ is defined as the reciprocal of the mixing time of the random walk over the induced subgraph on $A$. For instance, using $\mathsf{Conn}(A) = Ω(λ(A) / \log n)$ where $λ$ is the second eigenvalue of the Laplacian of the induced subgraph on $A$, our conductance guarantee can be as good as $\tilde{O}(φ(A)/\sqrt{λ(A)})$. This builds an interesting connection to the recent advance of the so-called improved Cheeger's Inequality [KKL+13], which says that global spectral algorithms can provide a conductance guarantee of $O(φ_{\mathsf{opt}}/\sqrt{λ_3})$ instead of $O(\sqrt{φ_{\mathsf{opt}}})$. In addition, we provide a theoretical guarantee on the clustering accuracy (in terms of precision and recall) of the output set. We also prove that our analysis is tight, and perform empirical evaluation to support our theory on both synthetic and real data. It is worth noting that our analysis outperforms prior work when the cluster is well-connected. In fact, the better connected the cluster is internally, the more significant the improvement (both in terms of conductance and accuracy) we can obtain. Our results shed light on why in practice some random-walk-based algorithms perform better than previous theory predicts, and help guide future research about local clustering.

Submitted 7 November, 2013; v1 submitted 30 April, 2013; originally announced April 2013.

Comments: An extended abstract of this paper has appeared in the proceedings of the 30th International Conference on Machine Learning (ICML 2013)
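
The two local graph clustering abstracts above state their guarantees in terms of the conductance of a vertex set. As a reading aid, here is a minimal sketch of that quantity under the standard definition, φ(A) = cut(A, V\A) / min(vol(A), vol(V\A)), for an unweighted undirected graph. It is not taken from either paper and does not implement their algorithms; the names `conductance`, `two_triangles`, and the `Graph` alias are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Iterable, Set

# Adjacency-set representation of an unweighted, undirected graph (assumed here).
Graph = Dict[Hashable, Set[Hashable]]


def conductance(graph: Graph, subset: Iterable[Hashable]) -> float:
    """Conductance of `subset`: cut edges / min(volume inside, volume outside)."""
    inside = set(subset)
    # Each edge leaving the subset is counted once, from its inside endpoint.
    cut = sum(1 for u in inside for v in graph[u] if v not in inside)
    # The volume of a vertex set is the sum of the degrees of its vertices.
    vol_inside = sum(len(graph[u]) for u in inside)
    vol_outside = sum(len(graph[u]) for u in graph if u not in inside)
    denom = min(vol_inside, vol_outside)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return cut / denom


def two_triangles() -> Graph:
    """Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3."""
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    graph: Graph = {v: set() for v in range(6)}
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


if __name__ == "__main__":
    g = two_triangles()
    print(conductance(g, {0, 1, 2}))  # one cut edge over a volume of 7
    print(conductance(g, {0}))        # both edges of vertex 0 leave the set
```

On this toy graph a triangle has low conductance because only one of its edges crosses the cut, which is the kind of well-separated target set the abstracts refer to.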

Showing 1–9 of 9 results for author: Zhu, A