-
Learning-rate-free Momentum SGD with Reshuffling Converges in Nonsmooth Nonconvex Optimization
Authors:
Xiaoyin Hu,
Nachuan Xiao,
Xin Liu,
Kim-Chuan Toh
Abstract:
In this paper, we propose a generalized framework for developing learning-rate-free momentum stochastic gradient descent (SGD) methods in the minimization of nonsmooth nonconvex functions, especially in training nonsmooth neural networks. Our framework adaptively generates learning rates based on the historical data of stochastic subgradients and iterates. Under mild conditions, we prove that our proposed framework enjoys global convergence to the stationary points of the objective function in the sense of the conservative field, hence providing convergence guarantees for training nonsmooth neural networks. Based on our proposed framework, we propose a novel learning-rate-free momentum SGD method (LFM). Preliminary numerical experiments reveal that LFM performs comparably to the state-of-the-art learning-rate-free methods (which have not been shown theoretically to be convergent) across well-known neural network training benchmarks.
Submitted 26 June, 2024;
originally announced June 2024.
-
Tests for principal eigenvalues and eigenvectors
Authors:
Jianqing Fan,
Yingying Li,
Ningning Xia,
Xinghua Zheng
Abstract:
We establish central limit theorems for principal eigenvalues and eigenvectors under a large factor model setting, and develop two-sample tests of both principal eigenvalues and principal eigenvectors. One important application is to detect structural breaks in large factor models. Compared with existing methods for detecting structural breaks, our tests provide unique insights into the source of structural breaks because they can distinguish between individual principal eigenvalues and/or eigenvectors. We demonstrate the application by comparing the principal eigenvalues and principal eigenvectors of S\&P500 Index constituents' daily returns over different years.
Submitted 11 May, 2024;
originally announced May 2024.
-
Developing Lagrangian-based Methods for Nonsmooth Nonconvex Optimization
Authors:
Nachuan Xiao,
Kuangyu Ding,
Xiaoyin Hu,
Kim-Chuan Toh
Abstract:
In this paper, we consider the minimization of a nonsmooth nonconvex objective function $f(x)$ over a closed convex subset $\mathcal{X}$ of $\mathbb{R}^n$, with additional nonsmooth nonconvex constraints $c(x) = 0$. We develop a unified framework for developing Lagrangian-based methods, which takes a single-step update to the primal variables by some subgradient methods in each iteration. These subgradient methods are ``embedded'' into our framework, in the sense that they are incorporated as black-box updates to the primal variables. We prove that our proposed framework inherits the global convergence guarantees from these embedded subgradient methods under mild conditions. In addition, we show that our framework can be extended to solve constrained optimization problems with expectation constraints. Based on the proposed framework, we show that a wide range of existing stochastic subgradient methods, including the proximal SGD, proximal momentum SGD, and proximal ADAM, can be embedded into Lagrangian-based methods. Preliminary numerical experiments on deep learning tasks illustrate that our proposed framework yields efficient variants of Lagrangian-based methods with convergence guarantees for nonconvex nonsmooth constrained optimization problems.
Submitted 14 April, 2024;
originally announced April 2024.
-
Decentralized Stochastic Subgradient Methods for Nonsmooth Nonconvex Optimization
Authors:
Siyuan Zhang,
Nachuan Xiao,
Xin Liu
Abstract:
In this paper, we concentrate on decentralized optimization problems with nonconvex and nonsmooth objective functions, especially on the decentralized training of nonsmooth neural networks. We introduce a unified framework to analyze the global convergence of decentralized stochastic subgradient-based methods. We prove the global convergence of our proposed framework under mild conditions, by establishing that the generated sequence asymptotically approximates the trajectories of its associated differential inclusion. Furthermore, we establish that our proposed framework covers a wide range of existing efficient decentralized subgradient-based methods, including decentralized stochastic subgradient descent (DSGD), DSGD with gradient-tracking technique (DSGD-T), and DSGD with momentum (DSGD-M). In addition, we introduce the sign map to regularize the update directions in DSGD-M, and show it is enclosed in our proposed framework. Consequently, our convergence results establish, for the first time, global convergence of these methods when applied to nonsmooth nonconvex objectives. Preliminary numerical experiments demonstrate that our proposed framework yields highly efficient decentralized subgradient-based methods with convergence guarantees in the training of nonsmooth neural networks.
Submitted 27 June, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
APT-MMF: An advanced persistent threat actor attribution method based on multimodal and multilevel feature fusion
Authors:
Nan Xiao,
Bo Lang,
Ting Wang,
Yikai Chen
Abstract:
Threat actor attribution is a crucial defense strategy for combating advanced persistent threats (APTs). Cyber threat intelligence (CTI), which involves analyzing multisource heterogeneous data from APTs, plays an important role in APT actor attribution. The current attribution methods extract features from different CTI perspectives and employ machine learning models to classify CTI reports according to their threat actors. However, these methods usually extract only one kind of feature and ignore heterogeneous information, especially the attributes and relations of indicators of compromise (IOCs), which form the core of CTI. To address these problems, we propose an APT actor attribution method based on multimodal and multilevel feature fusion (APT-MMF). First, we leverage a heterogeneous attributed graph to characterize APT reports and their IOC information. Then, we extract and fuse multimodal features, including attribute type features, natural language text features and topological relationship features, to construct comprehensive node representations. Furthermore, we design multilevel heterogeneous graph attention networks to learn the deep hidden features of APT report nodes; these networks integrate IOC type-level, metapath-based neighbor node-level, and metapath semantic-level attention. Utilizing multisource threat intelligence, we construct a heterogeneous attributed graph dataset for verification purposes. The experimental results show that our method not only outperforms the existing methods but also demonstrates its good interpretability for attribution analysis tasks.
Submitted 20 February, 2024;
originally announced February 2024.
-
Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning
Authors:
Fudan Zheng,
**dong Cao,
Weijiang Yu,
Zhiguang Chen,
Nong Xiao,
Yutong Lu
Abstract:
Most advances in medical image recognition supporting clinical auxiliary diagnosis meet challenges due to the low-resource situation in the medical field, where annotations are highly expensive and professional. This low-resource problem can be alleviated by leveraging the transferable representations of large-scale pre-trained vision-language models via relevant medical text prompts. However, existing pre-trained vision-language models require domain experts to carefully design the medical prompts, which greatly increases the burden on clinicians. To address this problem, we propose a weakly supervised prompt learning method MedPrompt to automatically generate medical prompts, which includes an unsupervised pre-trained vision-language model and a weakly supervised prompt learning model. The unsupervised pre-trained vision-language model utilizes the natural correlation between medical images and corresponding medical texts for pre-training, without any manual annotations. The weakly supervised prompt learning model only utilizes the classes of images in the dataset to guide the learning of the specific class vector in the prompt, while the learning of other context vectors in the prompt requires no manual annotations for guidance. To the best of our knowledge, this is the first model to automatically generate medical prompts. With these prompts, the pre-trained vision-language model can be freed from the strong expert dependency of manual annotation and manual prompt design. Experimental results show that the model using our automatically generated prompts outperforms its full-shot learning hand-crafted prompts counterparts with only a minimal number of labeled samples for few-shot learning, and reaches superior or comparable accuracy on zero-shot image classification. The proposed prompt generator is lightweight and therefore can be embedded into any network architecture.
Submitted 6 February, 2024;
originally announced February 2024.
-
Intensive Vision-guided Network for Radiology Report Generation
Authors:
Fudan Zheng,
Mengfei Li,
Ying Wang,
Weijiang Yu,
Ruixuan Wang,
Zhiguang Chen,
Nong Xiao,
Yutong Lu
Abstract:
Automatic radiology report generation is booming due to its huge application potential for the healthcare industry. However, existing computer vision and natural language processing approaches to tackle this problem are limited in two aspects. First, when extracting image features, most of them neglect multi-view reasoning in vision and model single-view structure of medical images, such as space-view or channel-view. However, clinicians rely on multi-view imaging information for comprehensive judgment in daily clinical diagnosis. Second, when generating reports, they overlook context reasoning with multi-modal information and focus on pure textual optimization utilizing retrieval-based methods. We aim to address these two issues by proposing a model that better simulates clinicians' perspectives and generates more accurate reports. Given the above limitation in feature extraction, we propose a Globally-intensive Attention (GIA) module in the medical image encoder to simulate and integrate multi-view vision perception. GIA aims to learn three types of vision perception: depth view, space view, and pixel view. On the other hand, to address the above problem in report generation, we explore how to involve multi-modal signals to generate precisely matched reports, i.e., how to integrate previously predicted words with region-aware visual content in next word prediction. Specifically, we design a Visual Knowledge-guided Decoder (VKGD), which can adaptively consider how much the model needs to rely on visual information and previously predicted text to assist next word prediction. Hence, our final Intensive Vision-guided Network (IVGN) framework includes a GIA-guided Visual Encoder and the VKGD. Experiments on two commonly-used datasets IU X-Ray and MIMIC-CXR demonstrate the superior ability of our method compared with other state-of-the-art approaches.
Submitted 6 February, 2024;
originally announced February 2024.
-
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
Authors:
Xinyu Peng,
Ziyang Zheng,
Wenrui Dai,
Nuoqian Xiao,
Chenglin Li,
Junni Zou,
Hongkai Xiong
Abstract:
Recent diffusion models provide a promising zero-shot solution to noisy linear inverse problems without retraining for specific inverse problems. In this paper, we reveal that recent methods can be uniformly interpreted as employing a Gaussian approximation with hand-crafted isotropic covariance for the intractable denoising posterior to approximate the conditional posterior mean. Inspired by this finding, we propose to improve recent methods by using more principled covariance determined by maximum likelihood estimation. To achieve posterior covariance optimization without retraining, we provide general plug-and-play solutions based on two approaches specifically designed for leveraging pre-trained models with and without reverse covariance. We further propose a scalable method for learning posterior covariance prediction based on representation with orthonormal basis. Experimental results demonstrate that the proposed methods significantly enhance reconstruction performance without requiring hyperparameter tuning.
Submitted 2 June, 2024; v1 submitted 3 February, 2024;
originally announced February 2024.
-
An Inexact Preconditioned Zeroth-order Proximal Method for Composite Optimization
Authors:
Shanglin Liu,
Lei Wang,
Nachuan Xiao,
Xin Liu
Abstract:
In this paper, we consider the composite optimization problem, where the objective function integrates a continuously differentiable loss function with a nonsmooth regularization term. Moreover, only the function values for the differentiable part of the objective function are available. To efficiently solve this composite optimization problem, we propose a preconditioned zeroth-order proximal gradient method in which the gradients and preconditioners are estimated by finite-difference schemes based on the function values at the same trial points. We establish the global convergence and worst-case complexity for our proposed method. Numerical experiments exhibit the superiority of our developed method.
Submitted 7 January, 2024;
originally announced January 2024.
-
AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts
Authors:
Yingpeng Wen,
Weijiang Yu,
Fudan Zheng,
Dan Huang,
Nong Xiao
Abstract:
Previous post-processing studies on rainfall forecasts using numerical weather prediction (NWP) mainly focus on statistics-based aspects, while learning-based aspects are rarely investigated. Although some manually-designed models are proposed to raise accuracy, they are customized networks, which need to be repeatedly tried and verified, at a huge cost in time and labor. Therefore, a self-supervised neural architecture search (NAS) method without significant manual efforts called AdaNAS is proposed in this study to perform rainfall forecast post-processing and predict rainfall with high accuracy. In addition, we design a rainfall-aware search space to significantly improve forecasts for high-rainfall areas. Furthermore, we propose a rainfall-level regularization function to eliminate the effect of noise data during the training. Validation experiments have been performed under the cases of \emph{None}, \emph{Light}, \emph{Moderate}, \emph{Heavy} and \emph{Violent} on a large-scale precipitation benchmark named TIGGE. Finally, the average mean-absolute error (MAE) and average root-mean-square error (RMSE) of the proposed AdaNAS model are 0.98 and 2.04 mm/day, respectively. Additionally, the proposed AdaNAS model is compared with other neural architecture search methods and previous studies. The comparison results reveal the satisfactory performance and superiority of the proposed AdaNAS model in terms of precipitation amount prediction and intensity classification. Concretely, the proposed AdaNAS model outperformed previous best-performing manual methods with MAE and RMSE improving by 80.5\% and 80.3\%, respectively.
Submitted 4 February, 2024; v1 submitted 26 December, 2023;
originally announced December 2023.
-
Liquid-shaped microlens for scalable production of ultrahigh-resolution OCT microendoscope
Authors:
Chao Xu,
Xin Guan,
Syeda Aimen Abbasi,
Neng Xia,
To Ngai,
Li Zhang,
Ho-Pui Ho,
Sze Hang Calvin Ng,
Wu Yuan
Abstract:
Endoscopic optical coherence tomography (OCT) is a valuable tool for providing diagnostic images of internal organs and guiding interventions in real time. Miniaturized OCT endoscopes are essential for imaging small and convoluted luminal organs while minimizing invasiveness. However, current methods for fabricating miniature fiber probes have limited ability to correct optical aberrations, leading to suboptimal imaging performance. In this study, we introduce a new paradigm of liquid shaping technique for the rapid and scalable fabrication of ultrathin and high-performance OCT microendoscopes suitable for minimally invasive clinical applications. This technique enables the flexible customization of freeform microlenses with sub-nanometer optical surface roughness by regulating the minimum energy state of curable optical liquid on a wettability-modified substrate and precisely controlling the liquid volume and physical boundary on a substrate. Using this technique, we simultaneously fabricated 800-nm OCT microendoscopes with a diameter of approximately 0.6 mm and evaluated their ultrahigh-resolution imaging performance in the esophagus of rats and the aorta and brain of mice.
Submitted 27 November, 2023;
originally announced November 2023.
-
Adam-family Methods with Decoupled Weight Decay in Deep Learning
Authors:
Kuangyu Ding,
Nachuan Xiao,
Kim-Chuan Toh
Abstract:
In this paper, we investigate the convergence properties of a wide class of Adam-family methods for minimizing quadratically regularized nonsmooth nonconvex optimization problems, especially in the context of training nonsmooth neural networks with weight decay. Motivated by the AdamW method, we propose a novel framework for Adam-family methods with decoupled weight decay. Within our framework, the estimators for the first-order and second-order moments of stochastic subgradients are updated independently of the weight decay term. Under mild assumptions and with non-diminishing stepsizes for updating the primary optimization variables, we establish the convergence properties of our proposed framework. In addition, we show that our proposed framework encompasses a wide variety of well-known Adam-family methods, hence offering convergence guarantees for these methods in the training of nonsmooth neural networks. More importantly, we show that our proposed framework asymptotically approximates the SGD method, thereby providing an explanation for the empirical observation that decoupled weight decay enhances generalization performance for Adam-family methods. As a practical application of our proposed framework, we propose a novel Adam-family method named Adam with Decoupled Weight Decay (AdamD), and establish its convergence properties under mild conditions. Numerical experiments demonstrate that AdamD outperforms Adam and is comparable to AdamW, in the aspects of both generalization performance and efficiency.
Submitted 13 October, 2023;
originally announced October 2023.
-
SGD-type Methods with Guaranteed Global Stability in Nonsmooth Nonconvex Optimization
Authors:
Nachuan Xiao,
Xiaoyin Hu,
Kim-Chuan Toh
Abstract:
In this paper, we focus on providing convergence guarantees for variants of the stochastic subgradient descent (SGD) method in minimizing nonsmooth nonconvex functions. We first develop a general framework to establish global stability for general stochastic subgradient methods, where the corresponding differential inclusion admits a coercive Lyapunov function. We prove that, with sufficiently small stepsizes and controlled noises, the iterates asymptotically stabilize around the stable set of its corresponding differential inclusion. Then we introduce a scheme for developing SGD-type methods with regularized update directions for the primal variables. Based on our developed framework, we prove the global stability of our proposed scheme under mild conditions. We further illustrate that our scheme yields variants of SGD-type methods, which enjoy guaranteed convergence in training nonsmooth neural networks. In particular, by employing the sign map to regularize the update directions, we propose a novel subgradient method named the Sign-map Regularized SGD method (SRSGD). Preliminary numerical experiments exhibit the high efficiency of SRSGD in training deep neural networks.
Submitted 13 May, 2024; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Disambiguated Lexically Constrained Neural Machine Translation
Authors:
**peng Zhang,
Nini Xiao,
Ke Wang,
Chuanqi Dong,
Xiangyu Duan,
Yuqi Zhang,
Min Zhang
Abstract:
Lexically constrained neural machine translation (LCNMT), which controls the translation generation with pre-specified constraints, is important in many practical applications. Current approaches to LCNMT typically assume that the pre-specified lexical constraints are contextually appropriate. This assumption limits their application to real-world scenarios where a source lexicon may have multiple target constraints, and disambiguation is needed to select the most suitable one. In this paper, we propose disambiguated LCNMT (D-LCNMT) to solve the problem. D-LCNMT is a robust and effective two-stage framework that disambiguates the constraints based on contexts at first, then integrates the disambiguated constraints into LCNMT. Experimental results show that our approach outperforms strong baselines including existing data augmentation based approaches on benchmark datasets, and comprehensive experiments in scenarios where a source lexicon corresponds to multiple target constraints demonstrate the constraint disambiguation superiority of our approach.
Submitted 26 May, 2023;
originally announced May 2023.
-
Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees
Authors:
Nachuan Xiao,
Xiaoyin Hu,
Xin Liu,
Kim-Chuan Toh
Abstract:
In this paper, we present a comprehensive study on the convergence properties of Adam-family methods for nonsmooth optimization, especially in the training of nonsmooth neural networks. We introduce a novel two-timescale framework that adopts a two-timescale updating scheme, and prove its convergence properties under mild assumptions. Our proposed framework encompasses various popular Adam-family methods, providing convergence guarantees for these methods in training nonsmooth neural networks. Furthermore, we develop stochastic subgradient methods that incorporate gradient clipping techniques for training nonsmooth neural networks with heavy-tailed noise. Through our framework, we show that our proposed methods converge even when the evaluation noises are only assumed to be integrable. Extensive numerical experiments demonstrate the high efficiency and robustness of our proposed methods.
Submitted 19 February, 2024; v1 submitted 6 May, 2023;
originally announced May 2023.
-
A Riemannian Dimension-reduced Second Order Method with Application in Sensor Network Localization
Authors:
Tianyun Tang,
Kim-Chuan Toh,
Nachuan Xiao,
Yinyu Ye
Abstract:
In this paper, we propose a cubic-regularized Riemannian optimization method (RDRSOM), which partially exploits the second order information and achieves the iteration complexity of $\mathcal{O}(1/ε^{3/2})$. In order to reduce the per-iteration computational cost, we further propose a practical version of (RDRSOM), which is an extension of the well known Barzilai-Borwein method and achieves the iteration complexity of $\mathcal{O}(1/ε^{3/2})$. We apply our method to solve a nonlinear formulation of the wireless sensor network localization problem whose feasible set is a Riemannian manifold that has not been considered in the literature before. Numerical experiments are conducted to verify the high efficiency of our algorithm compared to state-of-the-art Riemannian optimization methods and other nonlinear solvers.
Submitted 24 April, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
A Partial Exact Penalty Function Approach for Constrained Optimization
Authors:
Nachuan Xiao,
Xin Liu,
Kim-Chuan Toh
Abstract:
In this paper, we focus on a class of constrained nonlinear optimization problems (NLP), where some of its equality constraints define a closed embedded submanifold $\mathcal{M}$ in $\mathbb{R}^n$. Although NLP can be solved directly by various existing approaches for constrained optimization in Euclidean space, these approaches usually fail to recognize the manifold structure of $\mathcal{M}$. To achieve better efficiency by utilizing the manifold structure of $\mathcal{M}$ in directly applying these existing optimization approaches, we propose a partial penalty function approach for NLP. In our proposed penalty function approach, we transform NLP into the corresponding constraint dissolving problem (CDP) in the Euclidean space, where the constraints that define $\mathcal{M}$ are eliminated through exact penalization. We establish the relationships on the constraint qualifications between NLP and CDP, and prove that NLP and CDP have the same stationary points and KKT points in a neighborhood of the feasible region under mild conditions. Therefore, various existing optimization approaches developed for constrained optimization in the Euclidean space can be directly applied to solve NLP through CDP. Preliminary numerical experiments demonstrate that by dissolving the constraints that define $\mathcal{M}$, CDP gains superior computational efficiency when compared to directly applying existing optimization approaches to solve NLP, especially in high dimensional scenarios.
Submitted 3 April, 2023;
originally announced April 2023.
-
Swimming of the midge larva: principles and tricks of locomotion at intermediate Reynolds number
Authors:
Bowen Jin,
Chengfeng Pan,
Neng Xia,
Jialei Song,
Haoxiang Luo,
Li Zhang,
Yang Ding
Abstract:
At the millimeter scale and in the intermediate Reynolds number (Re) regime, the midge and mosquito larvae can reach swimming speeds of more than one body length per cycle performing a "figure-of-8" gait, in which their elongated bodies periodically bend nearly into circles and then fully unfold. To elucidate the propulsion mechanism of this cycle of motion, we conducted a 3D numerical study which investigates the hydrodynamics of a body undergoing the prescribed kinematics. Novel propulsion mechanisms are identified, such as modulating the body deformation rate to dynamically increase the maximum net propulsion force, using asymmetric kinematics to generate torque and the appropriate rotation, and controlling the radius of the curled body to manipulate the moment of inertia. The figure-of-8 gait is found to achieve propulsion at a wide range of Re, but is most effective at intermediate Re. The results were further validated experimentally, via the development of a soft millimeter-sized robot that can reach comparable speeds using the figure-of-8 gait.
Submitted 6 December, 2022;
originally announced December 2022.
-
CDOpt: A Python Package for a Class of Riemannian Optimization
Authors:
Nachuan Xiao,
Xiaoyin Hu,
Xin Liu,
Kim-Chuan Toh
Abstract:
Optimization over the embedded submanifold defined by constraints $c(x) = 0$ has attracted much interest over the past few decades due to its wide applications in various areas. Plenty of related optimization packages have been developed based on Riemannian optimization approaches, which rely on some basic geometrical materials of Riemannian manifolds, including retractions, vector transports, etc. These geometrical materials can be challenging to determine in general. Existing packages only accommodate a few well-known manifolds whose geometrical materials are easily accessible. For other manifolds which are not contained in these packages, the users have to develop the geometric materials by themselves. In addition, it is not always tractable to adopt advanced features from various state-of-the-art unconstrained optimization solvers to Riemannian optimization approaches.
We introduce CDOpt (available at https://cdopt.github.io/), a user-friendly Python package for a class of Riemannian optimization problems. Based on constraint dissolving approaches, Riemannian optimization problems are transformed into their equivalent unconstrained counterparts in CDOpt. Therefore, solving Riemannian optimization problems through CDOpt directly benefits from various existing solvers and the rich expertise gained over decades for unconstrained optimization. Moreover, all the computations in CDOpt related to any manifold in question are conducted on its constraint expression, hence users can easily define new manifolds in CDOpt without any background in differential geometry. Furthermore, CDOpt extends the neural layers from PyTorch and Flax, thus allowing users to train manifold-constrained neural networks directly with solvers for unconstrained optimization. Extensive numerical experiments demonstrate that CDOpt is highly efficient and robust in solving various classes of Riemannian optimization problems.
Submitted 28 March, 2023; v1 submitted 5 December, 2022;
originally announced December 2022.
-
TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task
Authors:
Xin Ge,
Ke Wang,
Jiayi Wang,
Nini Xiao,
Xiangyu Duan,
Yu Zhao,
Yuqi Zhang
Abstract:
This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion (TS). We participate in the English-German and English-Chinese tasks. We follow the paradigm of fine-tuning large-scale pre-trained models on downstream tasks, which has recently achieved great success. We choose FAIR's WMT19 English-German news translation system and MBART50 for English-Chinese as our pre-trained models. Because the task limits the use of training data, we follow the data augmentation strategies proposed by WeTS to boost our TS model performance. The difference is that we further use the dual conditional cross-entropy model and the GPT-2 language model to filter augmented data. The final leaderboard shows that our submissions rank first in three of four language directions in the Naive TS task of the WMT22 Translation Suggestion task.
Submitted 16 November, 2022;
originally announced November 2022.
-
MapQA: A Dataset for Question Answering on Choropleth Maps
Authors:
Shuaichen Chang,
David Palzer,
Jialin Li,
Eric Fosler-Lussier,
Ningchuan Xiao
Abstract:
Choropleth maps are a common visual representation for region-specific tabular data and are used in a number of different venues (newspapers, articles, etc). These maps are human-readable but are often challenging to deal with when trying to extract data for screen readers, analyses, or other related tasks. Recent research into Visual-Question Answering (VQA) has studied question answering on human-generated charts (ChartQA), such as bar, line, and pie charts. However, little work has paid attention to understanding maps; general VQA models, and ChartQA models, suffer when asked to perform this task. To facilitate and encourage research in this area, we present MapQA, a large-scale dataset of ~800K question-answer pairs over ~60K map images. Our task tests various levels of map understanding, from surface questions about map styles to complex questions that require reasoning on the underlying data. We present the unique challenges of MapQA that frustrate most strong baseline algorithms designed for ChartQA and general VQA tasks. We also present a novel algorithm, Visual Multi-Output Data Extraction based QA (V-MODEQA) for MapQA. V-MODEQA extracts the underlying structured data from a map image with a multi-output model and then performs reasoning on the extracted data. Our experimental results show that V-MODEQA has better overall performance and robustness on MapQA than the state-of-the-art ChartQA and VQA algorithms by capturing the unique properties in map question answering.
Submitted 15 November, 2022;
originally announced November 2022.
-
An Improved Unconstrained Approach for Bilevel Optimization
Authors:
Xiaoyin Hu,
Nachuan Xiao,
Xin Liu,
Kim-Chuan Toh
Abstract:
In this paper, we focus on the nonconvex-strongly-convex bilevel optimization problem (BLO). In this BLO, the objective function of the upper-level problem is nonconvex and possibly nonsmooth, and the lower-level problem is smooth and strongly convex with respect to the underlying variable $y$. We show that the feasible region of BLO is a Riemannian manifold. Then we transform BLO to its corresponding unconstrained constraint dissolving problem (CDB), whose objective function is explicitly formulated from the objective functions in BLO. We prove that BLO is equivalent to the unconstrained optimization problem CDB. Therefore, various efficient unconstrained approaches, together with their theoretical results, can be directly applied to BLO through CDB. We propose a unified framework for developing subgradient-based methods for CDB. Remarkably, we show that several existing efficient algorithms can fit the unified framework and be interpreted as descent algorithms for CDB. These examples further demonstrate the great potential of our proposed approach.
Submitted 23 December, 2022; v1 submitted 1 August, 2022;
originally announced August 2022.
-
Developing synthetic individual-level population datasets: The case of contextualizing maps of privacy-preserving census data
Authors:
Yue Lin,
Ningchuan Xiao
Abstract:
The purpose of this paper is to describe the development of a synthetic population dataset that is open and realistic and can be used to facilitate understanding the cartographic process and contextualizing the cartographic artifacts. We first discuss an optimization model that is designed to construct the synthetic population by minimizing the difference between the summarized information of the synthetic populations and the statistics published in census data tables. We then illustrate how the synthetic population dataset can be used to contextualize maps made using privacy-preserving census data. Two counties in Ohio are used as case studies.
Submitted 1 April, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
A Constraint Dissolving Approach for Nonsmooth Optimization over the Stiefel Manifold
Authors:
Xiaoyin Hu,
Nachuan Xiao,
Xin Liu,
Kim-Chuan Toh
Abstract:
This paper focuses on the minimization of a possibly nonsmooth objective function over the Stiefel manifold. The existing approaches either lack efficiency or can only tackle prox-friendly objective functions. We propose a constraint dissolving function named NCDF and show that it has the same first-order stationary points and local minimizers as the original problem in a neighborhood of the Stiefel manifold. Furthermore, we show that the Clarke subdifferential of NCDF is easy to obtain from the Clarke subdifferential of the objective function. Therefore, various existing approaches for unconstrained nonsmooth optimization can be directly applied to nonsmooth optimization problems over the Stiefel manifold. We propose a framework for developing subgradient-based methods and establish their convergence properties based on prior works. Furthermore, based on our proposed framework, we can develop efficient approaches for optimization over the Stiefel manifold. Preliminary numerical experiments further highlight that the proposed constraint dissolving approach yields efficient and direct implementations of various unconstrained approaches to nonsmooth optimization problems over the Stiefel manifold.
Submitted 20 January, 2023; v1 submitted 21 May, 2022;
originally announced May 2022.
-
AdMix: A Mixed Sample Data Augmentation Method for Neural Machine Translation
Authors:
Chang Jin,
Shigui Qiu,
Nini Xiao,
Hao Jia
Abstract:
In Neural Machine Translation (NMT), data augmentation methods such as back-translation have proven their effectiveness in improving translation performance. In this paper, we propose a novel data augmentation approach for NMT, which is independent of any additional training data. Our approach, AdMix, consists of two parts: 1) introduce faint discrete noise (word replacement, word dropping, word swapping) into the original sentence pairs to form augmented samples; 2) generate new synthetic training data by softly mixing the augmented samples with their original samples in the training corpus. Experiments on three translation datasets of different scales show that AdMix achieves significant improvements (1.0 to 2.7 BLEU points) over a strong Transformer baseline. When combined with other data augmentation techniques (e.g., back-translation), our approach can obtain further improvements.
Submitted 10 May, 2022;
originally announced May 2022.
-
Dissolving Constraints for Riemannian Optimization
Authors:
Nachuan Xiao,
Xin Liu,
Kim-Chuan Toh
Abstract:
In this paper, we consider optimization problems over closed embedded submanifolds of $\mathbb{R}^n$, which are defined by the constraints $c(x) = 0$. We propose a class of constraint dissolving approaches for these Riemannian optimization problems. In these proposed approaches, solving a Riemannian optimization problem is transferred into the unconstrained minimization of a constraint dissolving function named CDF. Different from existing exact penalty functions, the exact gradient and Hessian of CDF are easy to compute. We study the theoretical properties of CDF and prove that the original problem and CDF have the same first-order and second-order stationary points, local minimizers, and Łojasiewicz exponents in a neighborhood of the feasible region. Remarkably, the convergence properties of our proposed constraint dissolving approaches can be directly inherited from the existing rich results in unconstrained optimization. Therefore, the proposed constraint dissolving approaches build up short cuts from unconstrained optimization to Riemannian optimization. Several illustrative examples further demonstrate the potential of our proposed constraint dissolving approaches.
Submitted 14 October, 2022; v1 submitted 19 March, 2022;
originally announced March 2022.
-
Solving Optimization Problems over the Stiefel Manifold by Smooth Exact Penalty Function
Authors:
Nachuan Xiao,
Xin Liu
Abstract:
In this paper, we present a novel penalty model called ExPen for optimization over the Stiefel manifold. Different from existing penalty functions for orthogonality constraints, ExPen adopts a smooth penalty function without using any first-order derivative of the objective function. We show that all the first-order stationary points of ExPen with a sufficiently large penalty parameter are either feasible, namely, are the first-order stationary points of the original optimization problem, or far from the Stiefel manifold. Besides, the original problem and ExPen share the same second-order stationary points. Remarkably, the exact gradient and Hessian of ExPen are easy to compute. As a consequence, abundant algorithm resources in unconstrained optimization can be applied straightforwardly to solve ExPen.
Submitted 18 December, 2022; v1 submitted 17 October, 2021;
originally announced October 2021.
-
Atomic structure, electronic structure and optical absorption of inorganic perovskite compounds Cs2SnI6-nXn (X=F, Cl, Br; n= 0~6): A first-principles study
Authors:
Wang Xuan,
Tang Yehua,
Nairui Xiao,
Wang Ke-Fan
Abstract:
As a possible alternative to organic-inorganic hybrid perovskite halide, inorganic Cs2SnI6 has drawn more and more research attention recently. In order to find more Cs2SnI6 derivatives as the potential solar cell absorber materials, I- ions in Cs2SnI6 are replaced by other halogen ions to form the Cs2SnI6-nXn (X=F, Cl, Br; n=1~6) compounds, whose atomic structures, electronic structures and optical absorption are investigated by first-principles calculations. When the alloying level n increases, the mean lattice constants and the weighted Sn-X and Cs-X bond lengths all decrease linearly; the bond length of each Sn-X diminishes slightly inside the octahedral structure; Eg of Cs2SnI6-nXn increases nonlinearly. Eleven Cs2SnI6-nXn compounds have an Eg between 1.0 eV and 2.0 eV and so can be potentially used as the light absorption layer of solar cells. Their partial DOS demonstrate that as the alloying level n increases, the I 5p orbital in VBM and CBM is gradually substituted by the Br 4p, Cl 3p, or F 2p orbital. The eleven Cs2SnI6-nXn alloys all have a direct bandgap despite the lattice distortion induced by the alloyed X- ion.
Submitted 8 August, 2021;
originally announced August 2021.
-
Hybrid Reasoning Network for Video-based Commonsense Captioning
Authors:
Weijiang Yu,
Jian Liang,
Lei Ji,
Lu Li,
Yuejian Fang,
Nong Xiao,
Nan Duan
Abstract:
The task of video-based commonsense captioning aims to generate event-wise captions and meanwhile provide multiple commonsense descriptions (e.g., attribute, effect and intention) about the underlying event in the video. Prior works explore the commonsense captions by using separate networks for different commonsense types, which is time-consuming and lacks mining the interaction of different commonsense. In this paper, we propose a Hybrid Reasoning Network (HybridNet) to endow the neural networks with the capability of semantic-level reasoning and word-level reasoning. Firstly, we develop multi-commonsense learning for semantic-level reasoning by jointly training different commonsense types in a unified network, which encourages the interaction between the clues of multiple commonsense descriptions, event-wise captions and videos. Then, there are two steps to achieve the word-level reasoning: (1) a memory module records the history predicted sequence from the previous generation processes; (2) a memory-routed multi-head attention (MMHA) module updates the word-level attention maps by incorporating the history information from the memory module into the transformer decoder for word-level reasoning. Moreover, the multimodal features are used to make full use of diverse knowledge for commonsense reasoning. Experiments and abundant analysis on the large-scale Video-to-Commonsense benchmark show that our HybridNet achieves state-of-the-art performance compared with other methods.
Submitted 5 August, 2021;
originally announced August 2021.
-
Identifying High Accuracy Regions in Traffic Camera Images to Enhance the Estimation of Road Traffic Metrics: A Quadtree-Based Method
Authors:
Yue Lin,
Ningchuan Xiao
Abstract:
The growing number of real-time camera feeds in urban areas has made it possible to provide high-quality traffic data for effective transportation planning, operations, and management. However, deriving reliable traffic metrics from these camera feeds has been a challenge due to the limitations of current vehicle detection techniques, as well as the various camera conditions such as height and resolution. In this work, a quadtree-based algorithm is developed to continuously partition the image extent until only regions with high detection accuracy remain. These regions are referred to as the high-accuracy identification regions (HAIR) in this paper. We demonstrate how the use of the HAIR can improve the accuracy of traffic density estimates using images from traffic cameras at different heights and resolutions in Central Ohio. Our experiments show that the proposed algorithm can be used to derive robust HAIR where vehicle detection accuracy is 41 percent higher than that in the original image extent. The use of the HAIR also significantly improves the traffic density estimation with an overall decrease of 49 percent in root mean squared error.
Submitted 14 June, 2022; v1 submitted 26 June, 2021;
originally announced June 2021.
-
Combining Static Word Embeddings and Contextual Representations for Bilingual Lexicon Induction
Authors:
Jinpeng Zhang,
Baijun Ji,
Nini Xiao,
Xiangyu Duan,
Min Zhang,
Yangbin Shi,
Weihua Luo
Abstract:
Bilingual Lexicon Induction (BLI) aims to map words in one language to their translations in another, and is typically done by learning linear projections to align monolingual word representation spaces. Two classes of word representations have been explored for BLI: static word embeddings and contextual representations, but no studies have combined both. In this paper, we propose a simple yet effective mechanism to combine the static word embeddings and the contextual representations to utilize the advantages of both paradigms. We test the combination mechanism on various language pairs under the supervised and unsupervised BLI benchmark settings. Experiments show that our mechanism consistently improves performance over robust BLI baselines on all language pairs, with average improvements of 3.2 points in the supervised setting and 3.1 points in the unsupervised setting.
Submitted 10 June, 2021; v1 submitted 6 June, 2021;
originally announced June 2021.
-
AGMB-Transformer: Anatomy-Guided Multi-Branch Transformer Network for Automated Evaluation of Root Canal Therapy
Authors:
Yunxiang Li,
Guodong Zeng,
Yifan Zhang,
Jun Wang,
Qianni Zhang,
Qun Jin,
Lingling Sun,
Qisi Lian,
Neng Xia,
Ruizi Peng,
Kai Tang,
Yaqi Wang,
Shuai Wang
Abstract:
Accurate evaluation of the treatment result on X-ray images is a significant and challenging step in root canal therapy since the incorrect interpretation of the therapy results will hamper timely follow-up which is crucial to the patients' treatment outcome. Nowadays, the evaluation is performed in a manual manner, which is time-consuming, subjective, and error-prone. In this paper, we aim to automate this process by leveraging the advances in computer vision and artificial intelligence, to provide an objective and accurate method for root canal therapy result assessment. A novel anatomy-guided multi-branch Transformer (AGMB-Transformer) network is proposed, which first extracts a set of anatomy features and then uses them to guide a multi-branch Transformer network for evaluation. Specifically, we design a polynomial curve fitting segmentation strategy with the help of landmark detection to extract the anatomy features. Moreover, a branch fusion module and a multi-branch structure including our progressive Transformer and Group Multi-Head Self-Attention (GMHSA) are designed to focus on both global and local features for an accurate diagnosis. To facilitate the research, we have collected a large-scale root canal therapy evaluation dataset with 245 root canal therapy X-ray images, and the experiment results show that our AGMB-Transformer can improve the diagnosis accuracy from 57.96% to 90.20% compared with the baseline network. The proposed AGMB-Transformer can achieve a highly accurate evaluation of root canal therapy. To our best knowledge, our work is the first to perform automatic root canal therapy evaluation and has important clinical value to reduce the workload of endodontists.
Submitted 28 October, 2021; v1 submitted 1 May, 2021;
originally announced May 2021.
-
Dynamic Weighted Learning for Unsupervised Domain Adaptation
Authors:
Ni Xiao,
Lei Zhang
Abstract:
Unsupervised domain adaptation (UDA) aims to improve the classification performance on an unlabeled target domain by leveraging information from a fully labeled source domain. Recent approaches explore domain-invariant and class-discriminant representations to tackle this task. These methods, however, ignore the interaction between domain alignment learning and class discrimination learning. As a result, a missing or inadequate tradeoff between domain alignment and class discrimination is prone to the problem of negative transfer. In this paper, we propose Dynamic Weighted Learning (DWL) to avoid the discriminability vanishing problem caused by excessive alignment learning and the domain misalignment problem caused by excessive discriminant learning. Technically, DWL dynamically weights the learning losses of alignment and discriminability by introducing the degree of alignment and discriminability. Besides, the problem of sample imbalance across domains is first considered in our work, and we solve the problem by weighing the samples to guarantee information balance across domains. Extensive experiments demonstrate that DWL has an excellent performance in several benchmark datasets.
Submitted 22 March, 2021;
originally announced March 2021.
-
A Penalty-free Infeasible Approach for a Class of Nonsmooth Optimization Problems over the Stiefel Manifold
Authors:
Nachuan Xiao,
Xin Liu,
Ya-xiang Yuan
Abstract:
Transforming into an exact penalty function model with convex compact constraints yields efficient infeasible approaches for optimization problems with orthogonality constraints. For smooth and $\ell_{2,1}$-norm regularized cases, these infeasible approaches adopt a simple and orthonormalization-free updating scheme and show their high efficiency in the test examples. However, to avoid orthonormalization while enforcing the feasibility of the final solution, these infeasible approaches introduce a quadratic penalty term, where an inappropriate penalty parameter can lead to numerical inefficiency. Inspired by penalty-free approaches for smooth optimization problems, we propose a proximal first-order algorithm for a class of optimization problems with orthogonality constraints and a nonsmooth regularization term. The resulting algorithm, named sequential linearized proximal gradient method (SLPG), alternately takes tangential steps and normal steps to improve the optimality and feasibility respectively. In SLPG, the orthonormalization process is invoked only once at the last step if high precision in feasibility is needed, showing that the main iterations in SLPG are orthonormalization-free. Besides, neither the tangential steps nor the normal steps involve the penalty parameter, and thus SLPG is penalty-free and avoids the inefficiency caused by an inappropriate penalty parameter. We analyze the global convergence properties of SLPG where the tangential steps are inexactly computed. By inexactly computing tangential steps, for smooth cases and $\ell_{2,1}$-norm regularized cases, SLPG has a closed-form updating scheme, which leads to its cheap tangential steps. Numerical experiments illustrate the numerical advantages of SLPG when compared with existing first-order methods.
Submitted 28 March, 2021; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction
Authors:
Xuanzhao Wang,
Zhengping Che,
Bo Jiang,
Ning Xiao,
Ke Yang,
Jian Tang,
Jieping Ye,
Jingyu Wang,
Qi Qi
Abstract:
Video anomaly detection is commonly used in many applications such as security surveillance and is very challenging. A majority of recent video anomaly detection approaches utilize deep reconstruction models, but their performance is often suboptimal because of insufficient reconstruction error differences between normal and abnormal video frames in practice. Meanwhile, frame prediction-based anomaly detection methods have shown promising performance. In this paper, we propose a novel and robust unsupervised video anomaly detection method based on frame prediction, with a design that is more in line with the characteristics of surveillance videos. The proposed method is equipped with a multi-path ConvGRU-based frame prediction network that can better handle semantically informative objects and areas of different scales and capture spatial-temporal dependencies in normal videos. A noise tolerance loss is introduced during training to mitigate the interference caused by background noise. Extensive experiments have been conducted on the CUHK Avenue, ShanghaiTech Campus, and UCSD Pedestrian datasets, and the results show that our proposed method outperforms existing state-of-the-art approaches. Remarkably, our proposed method obtains a frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
Submitted 27 May, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
ElixirNet: Relation-aware Network Architecture Adaptation for Medical Lesion Detection
Authors:
Chenhan Jiang,
Shaoju Wang,
Hang Xu,
Xiaodan Liang,
Nong Xiao
Abstract:
Most advances in medical lesion detection networks are limited to subtle modifications of the conventional detection networks designed for natural images. However, there exists a vast domain gap between medical images and natural images, where medical image detection often suffers from several domain-specific challenges, such as high lesion/background similarity, dominant tiny lesions, and severe class imbalance. Is a hand-crafted detection network tailored for natural images undoubtedly good enough for a discrepant medical lesion domain? Are there more powerful operations, filters, and sub-networks that better fit the medical lesion detection problem to be discovered? In this paper, we introduce a novel ElixirNet that includes three components: 1) TruncatedRPN balances positive and negative data for false positive reduction; 2) Auto-lesion Block is automatically customized for medical images to incorporate relation-aware operations among region proposals, and leads to more suitable and efficient classification and localization; 3) Relation transfer module incorporates the semantic relationship and transfers the relevant contextual information with an interpretable graph, thus alleviating the problem of lack of annotations for all types of lesions. Experiments on DeepLesion and Kits19 demonstrate the effectiveness of ElixirNet, which improves both sensitivity and precision over FPN with fewer parameters.
Submitted 3 March, 2020;
originally announced March 2020.
-
The Rockerverse: Packages and Applications for Containerization with R
Authors:
Daniel Nüst,
Dirk Eddelbuettel,
Dom Bennett,
Robrecht Cannoodt,
Dav Clark,
Gergely Daroczi,
Mark Edmondson,
Colin Fay,
Ellis Hughes,
Lars Kjeldgaard,
Sean Lopp,
Ben Marwick,
Heather Nolis,
Jacqueline Nolis,
Hong Ooi,
Karthik Ram,
Noam Ross,
Lori Shepherd,
Péter Sólymos,
Tyson Lee Swetnam,
Nitesh Turaga,
Charlotte Van Petegem,
Jason Williams,
Craig Willis,
Nan Xiao
Abstract:
The Rocker Project provides widely used Docker images for R across different application scenarios. This article surveys downstream projects that build upon the Rocker Project images and presents the current state of R packages for managing Docker images and controlling containers. These use cases cover diverse topics such as package development, reproducible research, collaborative work, cloud-based data processing, and production deployment of services. The variety of applications demonstrates the power of the Rocker Project specifically and containerisation in general. Across the diverse ways to use containers, we identified common themes: reproducible environments, scalability and efficiency, and portability across clouds. We conclude that the current growth and diversification of use cases is likely to continue its positive impact, but see the need for consolidating the Rockerverse ecosystem of packages, developing common practices for applications, and exploring alternative containerisation software.
Submitted 17 August, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Physics driven reduced order model for real time blood flow simulations
Authors:
Sethuraman Sankaran,
David Lesage,
Rhea Tombropoulos,
Nan Xiao,
Hyun Jin Kim,
David Spain,
Michiel Schaap,
Charles A. Taylor
Abstract:
Predictive modeling of blood flow and pressure has numerous applications ranging from non-invasive assessment of the functional significance of disease to planning invasive procedures. While several such predictive modeling techniques have been proposed, their use in the clinic has been limited due in part to the significant time required to perform virtual interventions and compute the resultant changes in hemodynamic conditions. We propose a fast hemodynamic assessment method based on first constructing an exploration space of geometries, tailored to each patient, and subsequently building a physics driven reduced order model in this space. We demonstrate that this method can predict fractional flow reserve derived from coronary computed tomography angiography in response to changes to a patient-specific lumen geometry in real time while achieving high accuracy when compared to computational fluid dynamics simulations. We validated this method on over 1300 patients who received a coronary CT scan and demonstrated a correlation coefficient of 0.98 with an error of 0.005 ± 0.015 (95% confidence interval: (-0.020, 0.031)) as compared to three-dimensional blood flow calculations.
Submitted 4 November, 2019;
originally announced November 2019.
-
Heterogeneous Graph Learning for Visual Commonsense Reasoning
Authors:
Weijiang Yu,
Jingwen Zhou,
Weihao Yu,
Xiaodan Liang,
Nong Xiao
Abstract:
The visual commonsense reasoning task aims to lead the research field toward solving cognition-level reasoning, with the ability to predict correct answers while providing convincing reasoning paths, resulting in three sub-tasks, i.e., Q->A, QA->R and Q->AR. It poses great challenges for the proper semantic alignment between vision and linguistic domains and for knowledge reasoning to generate persuasive reasoning paths. Existing works either resort to a powerful end-to-end network that cannot produce interpretable reasoning paths or solely explore the intra-relationship of visual objects (homogeneous graph) while ignoring the cross-domain semantic alignment among visual concepts and linguistic words. In this paper, we propose a new Heterogeneous Graph Learning (HGL) framework for seamlessly integrating the intra-graph and inter-graph reasoning in order to bridge the vision and language domains. Our HGL consists of a primal vision-to-answer heterogeneous graph (VAHG) module and a dual question-to-answer heterogeneous graph (QAHG) module to interactively refine reasoning paths for semantic agreement. Moreover, our HGL integrates a contextual voting module to exploit a long-range visual context for better global reasoning. Experiments on the large-scale Visual Commonsense Reasoning benchmark demonstrate the superior performance of our proposed modules on three tasks (improving 5% accuracy on Q->A, 3.5% on QA->R, 5.8% on Q->AR).
Submitted 24 October, 2019;
originally announced October 2019.
-
Layout-Graph Reasoning for Fashion Landmark Detection
Authors:
Weijiang Yu,
Xiaodan Liang,
Ke Gong,
Chenhan Jiang,
Nong Xiao,
Liang Lin
Abstract:
Detecting dense landmarks for diverse clothes, as a fundamental technique for clothes analysis, has attracted increasing research attention due to its huge application potential. However, because they do not model the underlying semantic layout constraints among landmarks, prior works often detect ambiguous and structure-inconsistent landmarks of multiple overlapped clothes on one person. In this paper, we propose to seamlessly enforce structural layout relationships among landmarks on the intermediate representations via multiple stacked layout-graph reasoning layers. We define the layout-graph as a hierarchical structure including a root node, body-part nodes (e.g. upper body, lower body), coarse clothes-part nodes (e.g. collar, sleeve) and leaf landmark nodes (e.g. left-collar, right-collar). Each Layout-Graph Reasoning (LGR) layer maps feature representations into structural graph nodes via a Map-to-Node module, performs reasoning over structural graph nodes to achieve global layout coherency via a layout-graph reasoning module, and then maps graph nodes back to enhance feature representations via a Node-to-Map module. The layout-graph reasoning module integrates a graph clustering operation to generate representations of intermediate nodes (bottom-up inference) and then a graph deconvolution operation (top-down inference) over the whole graph. Extensive experiments on two public fashion landmark datasets demonstrate the superiority of our model. Furthermore, to advance fine-grained fashion landmark research in support of more comprehensive clothes generation and attribute recognition, we contribute the first Fine-grained Fashion Landmark Dataset (FFLD) containing 200k images annotated with at most 32 key-points for 13 clothes types.
Submitted 4 October, 2019;
originally announced October 2019.
-
Gradual Network for Single Image De-raining
Authors:
Zhe Huang,
Weijiang Yu,
Wayne Zhang,
Litong Feng,
Nong Xiao
Abstract:
Most advances in single image de-raining meet a key challenge, which is removing rain streaks with different scales and shapes while preserving image details. Existing single image de-raining approaches treat rain-streak removal directly as a process of pixel-wise regression. However, they fail to strike a balance between over-de-raining (e.g. removing texture details in rain-free regions) and under-de-raining (e.g. leaving rain streaks). In this paper, we first propose a coarse-to-fine network called Gradual Network (GraNet), consisting of a coarse stage and a fine stage, for delving into single image de-raining with different granularities. Specifically, to reveal coarse-grained rain-streak characteristics (e.g. long and thick rain streaks/raindrops), we propose a coarse stage that utilizes local-global spatial dependencies via a local-global subnetwork composed of region-aware blocks. Taking the residual result (the coarse de-rained result) between the rainy image sample (i.e. the input data) and the output of the coarse stage (i.e. the learnt rain mask) as input, the fine stage continues to de-rain by removing the fine-grained rain streaks (e.g. light rain streaks and water mist) to get a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block. Solid and comprehensive experiments on synthetic and real data demonstrate that our GraNet can significantly outperform the state-of-the-art methods by removing rain streaks with various densities, scales and shapes while keeping the image details of rain-free regions well-preserved.
Submitted 20 September, 2019;
originally announced September 2019.
-
On the estimation of high-dimensional integrated covariance matrix based on high-frequency data with multiple transactions
Authors:
Moming Wang,
Ningning Xia,
You Zhou
Abstract:
Due to the mechanism of recording, the presence of multiple transactions at each recording time has become a common feature of high-frequency data in financial markets. Using random matrix theory, this paper considers the estimation of integrated covariance (ICV) matrices of high-dimensional diffusion processes based on multiple high-frequency observations. We start by studying the estimator, the time-variation adjusted realized covariance (TVA) matrix, proposed in Zheng and Li (2011) without microstructure noise. We show that in the high-dimensional case, for a class C of diffusion processes, the limiting spectral distribution (LSD) of averaged TVA depends not only on that of ICV, but also on the numbers of multiple transactions at each recording time. However, in practice, the observed prices are always contaminated by market microstructure noise. Thus the limiting behavior of pre-averaging averaged TVA matrices is studied based on the noisy multiple observations. We show that for processes in class C, the pre-averaging averaged TVA has the desirable properties that it eliminates the effects of microstructure noise and multiple transactions, and that its LSD depends solely on that of the ICV matrix. Further, three types of nonlinear shrinkage estimators of ICV are proposed based on high-frequency noisy multiple observations. Simulation studies support our theoretical results and show the finite sample performance of the proposed estimators. Finally, high-frequency portfolio strategies are evaluated under these estimators in a real data analysis.
Submitted 5 September, 2019; v1 submitted 23 August, 2019;
originally announced August 2019.
-
Judging Chemical Reaction Practicality From Positive Sample only Learning
Authors:
Shu Jiang,
Zhuosheng Zhang,
Hai Zhao,
Jiangtong Li,
Yang Yang,
Bao-Liang Lu,
Ning Xia
Abstract:
Judging chemical reaction practicality is the core task among all symbol intelligence based chemical information processing; for example, it provides an indispensable clue for further automatic synthesis route inference. Considering that chemical reactions have been represented in a language form, we propose a new solution to generally judge the practicality of organic reactions without considering complex quantum physical modeling or chemistry knowledge. When tackling the practicality judgment as a machine learning task from positive and negative (chemical reaction) samples, all existing studies have to carefully handle the serious insufficiency of negative samples. We propose an auto-construction method to resolve this long-standing and widespread difficulty. Experimental results show that our model can effectively predict the practicality of chemical reactions, achieving a high accuracy of 99.76% on real large-scale chemical lab reaction practicality judgment.
Submitted 22 April, 2019;
originally announced April 2019.
-
Cross-Modal Attentional Context Learning for RGB-D Object Detection
Authors:
Guanbin Li,
Yukang Gan,
Hejun Wu,
Nong Xiao,
Liang Lin
Abstract:
Recognizing objects from simultaneously sensed photometric (RGB) and depth channels is a fundamental yet practical problem in many machine vision applications such as robot grasping and autonomous driving. In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data. Compared to existing RGB-D object detection frameworks, our approach has several appealing properties. First, it consists of an attention-based global context model for exploiting adaptive contextual information and incorporating this information into a region-based CNN (e.g., Fast RCNN) framework to achieve improved object detection performance. Second, our CMAC framework further contains a fine-grained object part attention module to harness multiple discriminative object parts inside each possible object region for superior local feature representation. While greatly improving the accuracy of RGB-D object detection, the effective cross-modal information fusion as well as attentional context modeling in our proposed model provide an interpretable visualization scheme. Experimental results demonstrate that the proposed method significantly improves upon the state of the art on all public benchmarks.
Submitted 30 October, 2018;
originally announced October 2018.
-
From Communication to Sensing : Recognizing and Counting Repetitive Motions with Wireless Backscattering
Authors:
Ning Xiao,
Panlong Yang,
Yubo Yan,
Hao Zhou,
Xiang-Yang Li,
Haohua Du
Abstract:
Recently, several ground-breaking RF-based motion recognition systems were proposed to detect and/or recognize macro/micro human movements. These systems often suffer from various interferences caused by multiple users moving simultaneously, resulting in extremely low recognition accuracy. To tackle this challenge, we propose a novel system, called Motion-Fi, which marries battery-free wireless backscattering and device-free sensing. Motion-Fi is an accurate, interference-tolerant motion-recognition system, which counts repetitive motions without using scenario-dependent templates or profiles and enables multiple users to perform certain motions simultaneously because of the relatively short transmission range of backscattered signals. Although the repetitive motions are fairly well detectable through the backscattering signals in theory, in reality they get blended into various other system noises during the motion. Moreover, irregular motion patterns among users lead to expensive computation cost for motion recognition. We build a backscattering wireless platform to validate our design in various scenarios for over 6 months, incorporating different persons, distances and orientations. In our experiments, the periodicity in motions could be recognized without any learning or training process, and such motions could be counted to within 5% count error. With little effort in learning the patterns, our method could achieve 93.1% motion-recognition accuracy for a variety of motions. Moreover, by leveraging the periodicity of motions, the recognition accuracy could be further improved to nearly 100% with only 3 repetitions. Our experiments also show that the motions of multiple persons separated by around 2 meters cause little accuracy reduction in the counting process.
Submitted 27 October, 2018;
originally announced October 2018.
-
COUSTIC: Combinatorial Double auction for Task Assignment in Device-to-Device Clouds
Authors:
Yutong Zhai,
Liusheng Huang,
Long Chen,
Ning Xiao,
Yangyang Geng
Abstract:
With the emerging technologies of the Internet of Things (IoT), the capabilities of mobile devices have increased tremendously. However, in the big data era, completing tasks on one device is still challenging. As an emerging technology, crowdsourcing, which utilizes crowds of devices to facilitate large scale sensing tasks, has been gaining more and more research attention. Most existing works either assume devices are willing to cooperate under centralized mechanisms or design incentive algorithms using double auctions. The former is not practical when there is no centralized controller, and the latter is not suitable when the seller device is also resource constrained. In this paper, we propose a truthful incentive mechanism with combinatorial double auction for crowd sensing task assignment in device-to-device (D2D) clouds, where a single mobile device with an intensive sensing task can hire a group of idle neighboring devices. With this new mechanism, time critical sensing tasks can be handled in time in a distributed manner. We prove that the proposed mechanism is truthful, individually rational, budget balanced and computationally efficient. Our simulation results demonstrate that the combinatorial double auction mechanism achieves gains of 26.3% and 15.8% in comparison to the existing double auction scheme and the centralized maximum matching based algorithm, respectively.
Submitted 24 October, 2018;
originally announced October 2018.
-
Learning to Segment Object Candidates via Recursive Neural Networks
Authors:
Tianshui Chen,
Liang Lin,
Xian Wu,
Nong Xiao,
Xiaonan Luo
Abstract:
To avoid the exhaustive search over locations and scales, current state-of-the-art object detection systems usually involve a crucial component generating a batch of candidate object proposals from images. In this paper, we present a simple yet effective approach for segmenting object proposals via a deep architecture of recursive neural networks (ReNNs), which hierarchically groups regions for detecting object candidates over scales. Unlike traditional methods that mainly adopt fixed similarity measures for merging regions or finding object proposals, our approach adaptively learns the region merging similarity and the objectness measure during the process of hierarchical region grouping. Specifically, guided by a structured loss, the ReNN model jointly optimizes the cross-region similarity metric with the region merging process as well as the objectness prediction. During inference of the object proposal generation, we introduce randomness into the greedy search to cope with the ambiguity of grouping regions. Extensive experiments on standard benchmarks, e.g., PASCAL VOC and ImageNet, suggest that our approach is capable of producing object proposals with high recall while well preserving the object boundaries and outperforms other existing methods in both accuracy and efficiency.
Submitted 28 July, 2018; v1 submitted 3 December, 2016;
originally announced December 2016.
-
Shrinkage estimation of covariance matrix for portfolio choice with high frequency data
Authors:
Cheng Liu,
Ningning Xia,
Jun Yu
Abstract:
This paper examines the usefulness of high frequency data in estimating the covariance matrix for portfolio choice when the portfolio size is large. A computationally convenient nonlinear shrinkage estimator for the integrated covariance (ICV) matrix of financial assets is developed in two steps. The eigenvectors of the ICV are first constructed from a designed time variation adjusted realized covariance matrix of noise-free log-returns of relatively low frequency data. Then the regularized eigenvalues of the ICV are estimated by quasi-maximum likelihood based on high frequency data. The estimator is always positive definite and its inverse is the estimator of the inverse of ICV. It minimizes the limit of the out-of-sample variance of portfolio returns within the class of rotation-equivalent estimators. It works when the number of underlying assets is larger than the number of time series observations in each asset and when the asset price follows a general stochastic process. Our theoretical results are derived under the assumption that the number of assets ($p$) and the sample size ($n$) satisfy $p/n \to y > 0$ as $n \to \infty$. The advantages of our proposed estimator are demonstrated using real data.
Submitted 21 November, 2016;
originally announced November 2016.
-
Convergence rate of eigenvector empirical spectral distribution of large Wigner matrices
Authors:
Ningning Xia,
Zhidong Bai
Abstract:
In this paper, we adopt the eigenvector empirical spectral distribution (VESD) to investigate the limiting behavior of eigenvectors of a large dimensional Wigner matrix $W_n$. In particular, we derive the optimal bound for the rate of convergence of the expected VESD of $W_n$ to the semicircle law, which is of order $O(n^{-1/2})$ under the assumption of a finite 10th moment. We further show that the convergence rates in probability and almost surely of the VESD are $O(n^{-1/4})$ and $O(n^{-1/6})$, respectively, under a finite 8th moment condition. Numerical studies demonstrate that the convergence rate does not depend on the choice of unit vector involved in the VESD function, and the best possible bound for the rate of convergence of the VESD is of order $O(n^{-1/2})$.
Submitted 21 November, 2016;
originally announced November 2016.
-
On the inference about the spectral distribution of high-dimensional covariance matrix based on high-frequency noisy observations
Authors:
Ningning Xia,
Xinghua Zheng
Abstract:
In practice, observations are often contaminated by noise, making the resulting sample covariance matrix a signal-plus-noise sample covariance matrix. Aiming to make inferences about the spectral distribution of the population covariance matrix under such a situation, we establish an asymptotic relationship that describes how the limiting spectral distribution of (signal) sample covariance matrices depends on that of signal-plus-noise-type sample covariance matrices. As an application, we consider inferences about the spectral distribution of integrated covolatility (ICV) matrices of high-dimensional diffusion processes based on high-frequency data with microstructure noise. The (slightly modified) pre-averaging estimator is a signal-plus-noise sample covariance matrix, and the aforementioned result, together with a (generalized) connection between the spectral distribution of signal sample covariance matrices and that of the population covariance matrix, enables us to propose a two-step procedure to consistently estimate the spectral distribution of ICV for a class of diffusion processes. An alternative approach is further proposed, which possesses several desirable properties: it is more robust, it eliminates the effects of microstructure noise, and the asymptotic relationship that enables consistent estimation of the spectral distribution of ICV is the standard Marcenko-Pastur equation. The performance of the two approaches is examined via simulation studies under both synchronous and asynchronous observation settings.
Submitted 1 March, 2017; v1 submitted 12 April, 2016;
originally announced April 2016.