Learning-rate-free Momentum SGD with Reshuffling Converges in Nonsmooth Nonconvex Optimization

Authors: Xiaoyin Hu, Nachuan Xiao, Xin Liu, Kim-Chuan Toh

Abstract: In this paper, we propose a generalized framework for developing learning-rate-free momentum stochastic gradient descent (SGD) methods in the minimization of nonsmooth nonconvex functions, especially in training nonsmooth neural networks. Our framework adaptively generates learning rates based on the historical data of stochastic subgradients and iterates. Under mild conditions, we prove that our proposed framework enjoys global convergence to the stationary points of the objective function in the sense of the conservative field, hence providing convergence guarantees for training nonsmooth neural networks. Based on our proposed framework, we propose a novel learning-rate-free momentum SGD method (LFM). Preliminary numerical experiments reveal that LFM performs comparably to the state-of-the-art learning-rate-free methods (which have not been shown theoretically to be convergent) across well-known neural network training benchmarks.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 26 pages

arXiv:2405.06939 [pdf, other]

Tests for principal eigenvalues and eigenvectors

Authors: Jianqing Fan, Yingying Li, Ningning Xia, Xinghua Zheng

Abstract: We establish central limit theorems for principal eigenvalues and eigenvectors under a large factor model setting, and develop two-sample tests of both principal eigenvalues and principal eigenvectors. One important application is to detect structural breaks in large factor models. Compared with existing methods for detecting structural breaks, our tests provide unique insights into the source of structural breaks because they can distinguish between individual principal eigenvalues and/or eigenvectors. We demonstrate the application by comparing the principal eigenvalues and principal eigenvectors of S&P 500 Index constituents' daily returns over different years.

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2404.09438 [pdf, other]

Developing Lagrangian-based Methods for Nonsmooth Nonconvex Optimization

Authors: Nachuan Xiao, Kuangyu Ding, Xiaoyin Hu, Kim-Chuan Toh

Abstract: In this paper, we consider the minimization of a nonsmooth nonconvex objective function $f(x)$ over a closed convex subset $\mathcal{X}$ of $\mathbb{R}^n$, with additional nonsmooth nonconvex constraints $c(x) = 0$. We develop a unified framework for developing Lagrangian-based methods, which takes a single-step update to the primal variables by some subgradient methods in each iteration. These subgradient methods are "embedded" into our framework, in the sense that they are incorporated as black-box updates to the primal variables. We prove that our proposed framework inherits the global convergence guarantees from these embedded subgradient methods under mild conditions. In addition, we show that our framework can be extended to solve constrained optimization problems with expectation constraints. Based on the proposed framework, we show that a wide range of existing stochastic subgradient methods, including the proximal SGD, proximal momentum SGD, and proximal ADAM, can be embedded into Lagrangian-based methods. Preliminary numerical experiments on deep learning tasks illustrate that our proposed framework yields efficient variants of Lagrangian-based methods with convergence guarantees for nonconvex nonsmooth constrained optimization problems.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 30 pages, 4 figures

arXiv:2403.11565 [pdf, other]

Decentralized Stochastic Subgradient Methods for Nonsmooth Nonconvex Optimization

Authors: Siyuan Zhang, Nachuan Xiao, Xin Liu

Abstract: In this paper, we concentrate on decentralized optimization problems with nonconvex and nonsmooth objective functions, especially on the decentralized training of nonsmooth neural networks. We introduce a unified framework to analyze the global convergence of decentralized stochastic subgradient-based methods. We prove the global convergence of our proposed framework under mild conditions, by establishing that the generated sequence asymptotically approximates the trajectories of its associated differential inclusion. Furthermore, we establish that our proposed framework covers a wide range of existing efficient decentralized subgradient-based methods, including decentralized stochastic subgradient descent (DSGD), DSGD with gradient-tracking technique (DSGD-T), and DSGD with momentum (DSGD-M). In addition, we introduce the sign map to regularize the update directions in DSGD-M, and show it is enclosed in our proposed framework. Consequently, our convergence results establish, for the first time, global convergence of these methods when applied to nonsmooth nonconvex objectives. Preliminary numerical experiments demonstrate that our proposed framework yields highly efficient decentralized subgradient-based methods with convergence guarantees in the training of nonsmooth neural networks.

Submitted 27 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 22 pages

arXiv:2402.12743 [pdf]

APT-MMF: An advanced persistent threat actor attribution method based on multimodal and multilevel feature fusion

Authors: Nan Xiao, Bo Lang, Ting Wang, Yikai Chen

Abstract: Threat actor attribution is a crucial defense strategy for combating advanced persistent threats (APTs). Cyber threat intelligence (CTI), which involves analyzing multisource heterogeneous data from APTs, plays an important role in APT actor attribution. The current attribution methods extract features from different CTI perspectives and employ machine learning models to classify CTI reports according to their threat actors. However, these methods usually extract only one kind of feature and ignore heterogeneous information, especially the attributes and relations of indicators of compromise (IOCs), which form the core of CTI. To address these problems, we propose an APT actor attribution method based on multimodal and multilevel feature fusion (APT-MMF). First, we leverage a heterogeneous attributed graph to characterize APT reports and their IOC information. Then, we extract and fuse multimodal features, including attribute type features, natural language text features and topological relationship features, to construct comprehensive node representations. Furthermore, we design multilevel heterogeneous graph attention networks to learn the deep hidden features of APT report nodes; these networks integrate IOC type-level, metapath-based neighbor node-level, and metapath semantic-level attention. Utilizing multisource threat intelligence, we construct a heterogeneous attributed graph dataset for verification purposes. The experimental results show that our method not only outperforms the existing methods but also demonstrates good interpretability for attribution analysis tasks.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.03783 [pdf, other]

Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning

Authors: Fudan Zheng, Jindong Cao, Weijiang Yu, Zhiguang Chen, Nong Xiao, Yutong Lu

Abstract: Most advances in medical image recognition supporting clinical auxiliary diagnosis meet challenges due to the low-resource situation in the medical field, where annotations are highly expensive and require professional expertise. This low-resource problem can be alleviated by leveraging the transferable representations of large-scale pre-trained vision-language models via relevant medical text prompts. However, existing pre-trained vision-language models require domain experts to carefully design the medical prompts, which greatly increases the burden on clinicians. To address this problem, we propose a weakly supervised prompt learning method, MedPrompt, to automatically generate medical prompts, which includes an unsupervised pre-trained vision-language model and a weakly supervised prompt learning model. The unsupervised pre-trained vision-language model utilizes the natural correlation between medical images and corresponding medical texts for pre-training, without any manual annotations. The weakly supervised prompt learning model only utilizes the classes of images in the dataset to guide the learning of the specific class vector in the prompt, while the learning of other context vectors in the prompt requires no manual annotations for guidance. To the best of our knowledge, this is the first model to automatically generate medical prompts. With these prompts, the pre-trained vision-language model can be freed from the strong expert dependency of manual annotation and manual prompt design. Experimental results show that the model using our automatically generated prompts, with only a minimal number of labeled samples for few-shot learning, outperforms its full-shot counterparts that use hand-crafted prompts, and reaches superior or comparable accuracy on zero-shot image classification. The proposed prompt generator is lightweight and therefore can be embedded into any network architecture.

Submitted 6 February, 2024; originally announced February 2024.

Comments: Accepted by Pattern Recognition

arXiv:2402.03754 [pdf, other]

Intensive Vision-guided Network for Radiology Report Generation

Authors: Fudan Zheng, Mengfei Li, Ying Wang, Weijiang Yu, Ruixuan Wang, Zhiguang Chen, Nong Xiao, Yutong Lu

Abstract: Automatic radiology report generation is booming due to its huge application potential for the healthcare industry. However, existing computer vision and natural language processing approaches to tackle this problem are limited in two aspects. First, when extracting image features, most of them neglect multi-view reasoning in vision and model single-view structure of medical images, such as space-view or channel-view. However, clinicians rely on multi-view imaging information for comprehensive judgment in daily clinical diagnosis. Second, when generating reports, they overlook context reasoning with multi-modal information and focus on pure textual optimization utilizing retrieval-based methods. We aim to address these two issues by proposing a model that better simulates clinicians' perspectives and generates more accurate reports. Given the above limitation in feature extraction, we propose a Globally-intensive Attention (GIA) module in the medical image encoder to simulate and integrate multi-view vision perception. GIA aims to learn three types of vision perception: depth view, space view, and pixel view. On the other hand, to address the above problem in report generation, we explore how to involve multi-modal signals to generate precisely matched reports, i.e., how to integrate previously predicted words with region-aware visual content in next word prediction. Specifically, we design a Visual Knowledge-guided Decoder (VKGD), which can adaptively consider how much the model needs to rely on visual information and previously predicted text to assist next word prediction. Hence, our final Intensive Vision-guided Network (IVGN) framework includes a GIA-guided Visual Encoder and the VKGD. Experiments on two commonly-used datasets IU X-Ray and MIMIC-CXR demonstrate the superior ability of our method compared with other state-of-the-art approaches.

Submitted 6 February, 2024; originally announced February 2024.

Comments: Accepted by Physics in Medicine & Biology

arXiv:2402.02149 [pdf, other]

Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance

Authors: Xinyu Peng, Ziyang Zheng, Wenrui Dai, Nuoqian Xiao, Chenglin Li, Junni Zou, Hongkai Xiong

Abstract: Recent diffusion models provide a promising zero-shot solution to noisy linear inverse problems without retraining for specific inverse problems. In this paper, we reveal that recent methods can be uniformly interpreted as employing a Gaussian approximation with hand-crafted isotropic covariance for the intractable denoising posterior to approximate the conditional posterior mean. Inspired by this finding, we propose to improve recent methods by using more principled covariance determined by maximum likelihood estimation. To achieve posterior covariance optimization without retraining, we provide general plug-and-play solutions based on two approaches specifically designed for leveraging pre-trained models with and without reverse covariance. We further propose a scalable method for learning posterior covariance prediction based on representation with orthonormal basis. Experimental results demonstrate that the proposed methods significantly enhance reconstruction performance without requiring hyperparameter tuning.

Submitted 2 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

arXiv:2401.03565 [pdf, other]

An Inexact Preconditioned Zeroth-order Proximal Method for Composite Optimization

Authors: Shanglin Liu, Lei Wang, Nachuan Xiao, Xin Liu

Abstract: In this paper, we consider the composite optimization problem, where the objective function integrates a continuously differentiable loss function with a nonsmooth regularization term. Moreover, only the function values for the differentiable part of the objective function are available. To efficiently solve this composite optimization problem, we propose a preconditioned zeroth-order proximal gradient method in which the gradients and preconditioners are estimated by finite-difference schemes based on the function values at the same trial points. We establish the global convergence and worst-case complexity for our proposed method. Numerical experiments exhibit the superiority of our developed method.

Submitted 7 January, 2024; originally announced January 2024.
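
The zeroth-order proximal gradient idea in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' method: it omits the preconditioner and the inexactness, uses a plain forward-difference gradient estimate, and assumes an L1 regularizer so that the proximal step reduces to soft-thresholding. The names fd_gradient, soft_threshold, zo_proximal_gradient and the test function loss are hypothetical.

```python
def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate that uses only function values of f."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((f(shifted) - fx) / h)
    return grad


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (assumed regularizer), applied elementwise."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def zo_proximal_gradient(f, x0, lam=0.1, step=0.1, iters=200):
    """Minimize f(x) + lam * ||x||_1 using function values of f only."""
    x = list(x0)
    for _ in range(iters):
        g = fd_gradient(f, x)  # estimated gradient of the smooth part
        trial = [xi - step * gi for xi, gi in zip(x, g)]
        x = soft_threshold(trial, step * lam)  # proximal step on the L1 term
    return x


def loss(x):
    """Hypothetical smooth loss: a separable quadratic centered at (1.0, 0.05)."""
    return 0.5 * ((x[0] - 1.0) ** 2 + (x[1] - 0.05) ** 2)


x_star = zo_proximal_gradient(loss, [0.0, 0.0])
print(x_star)  # first coordinate close to 0.9, second coordinate thresholded to zero
```

For this separable quadratic the exact minimizer is the soft-thresholded center, so the small second coordinate is driven to zero while the first shrinks by lam.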

arXiv:2312.16046 [pdf, other]

AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts

Authors: Yingpeng Wen, Weijiang Yu, Fudan Zheng, Dan Huang, Nong Xiao

Abstract: Previous post-processing studies on rainfall forecasts using numerical weather prediction (NWP) mainly focus on statistics-based aspects, while learning-based aspects are rarely investigated. Although some manually-designed models are proposed to raise accuracy, they are customized networks, which need to be repeatedly tried and verified, at a huge cost in time and labor. Therefore, a self-supervised neural architecture search (NAS) method without significant manual efforts called AdaNAS is proposed in this study to perform rainfall forecast post-processing and predict rainfall with high accuracy. In addition, we design a rainfall-aware search space to significantly improve forecasts for high-rainfall areas. Furthermore, we propose a rainfall-level regularization function to eliminate the effect of noise data during the training. Validation experiments have been performed under the cases of None, Light, Moderate, Heavy and Violent on a large-scale precipitation benchmark named TIGGE. Finally, the average mean-absolute error (MAE) and average root-mean-square error (RMSE) of the proposed AdaNAS model are 0.98 and 2.04 mm/day, respectively. Additionally, the proposed AdaNAS model is compared with other neural architecture search methods and previous studies. The comparison results reveal the satisfactory performance and superiority of the proposed AdaNAS model in terms of precipitation amount prediction and intensity classification. Concretely, the proposed AdaNAS model outperformed previous best-performing manual methods with MAE and RMSE improving by 80.5% and 80.3%, respectively.

Submitted 4 February, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

arXiv:2311.15636 [pdf]

Liquid-shaped microlens for scalable production of ultrahigh-resolution OCT microendoscope

Authors: Chao Xu, Xin Guan, Syeda Aimen Abbasi, Neng Xia, To Ngai, Li Zhang, Ho-Pui Ho, Sze Hang Calvin Ng, Wu Yuan

Abstract: Endoscopic optical coherence tomography (OCT) is a valuable tool for providing diagnostic images of internal organs and guiding interventions in real time. Miniaturized OCT endoscopes are essential for imaging small and convoluted luminal organs while minimizing invasiveness. However, current methods for fabricating miniature fiber probes have limited ability to correct optical aberrations, leading to suboptimal imaging performance. In this study, we introduce a new paradigm of liquid shaping technique for the rapid and scalable fabrication of ultrathin and high-performance OCT microendoscopes suitable for minimally invasive clinical applications. This technique enables the flexible customization of freeform microlenses with sub-nanometer optical surface roughness by regulating the minimum energy state of curable optical liquid on a wettability-modified substrate and precisely controlling the liquid volume and physical boundary on a substrate. Using this technique, we simultaneously fabricated 800-nm OCT microendoscopes with a diameter of approximately 0.6 mm and evaluated their ultrahigh-resolution imaging performance in the esophagus of rats and the aorta and brain of mice.

Submitted 27 November, 2023; originally announced November 2023.

Comments: 42 pages, 7 figures in the main text

MSC Class: 78-05

arXiv:2310.08858 [pdf, other]

Adam-family Methods with Decoupled Weight Decay in Deep Learning

Authors: Kuangyu Ding, Nachuan Xiao, Kim-Chuan Toh

Abstract: In this paper, we investigate the convergence properties of a wide class of Adam-family methods for minimizing quadratically regularized nonsmooth nonconvex optimization problems, especially in the context of training nonsmooth neural networks with weight decay. Motivated by the AdamW method, we propose a novel framework for Adam-family methods with decoupled weight decay. Within our framework, the estimators for the first-order and second-order moments of stochastic subgradients are updated independently of the weight decay term. Under mild assumptions and with non-diminishing stepsizes for updating the primary optimization variables, we establish the convergence properties of our proposed framework. In addition, we show that our proposed framework encompasses a wide variety of well-known Adam-family methods, hence offering convergence guarantees for these methods in the training of nonsmooth neural networks. More importantly, we show that our proposed framework asymptotically approximates the SGD method, thereby providing an explanation for the empirical observation that decoupled weight decay enhances generalization performance for Adam-family methods. As a practical application of our proposed framework, we propose a novel Adam-family method named Adam with Decoupled Weight Decay (AdamD), and establish its convergence properties under mild conditions. Numerical experiments demonstrate that AdamD outperforms Adam and is comparable to AdamW, in the aspects of both generalization performance and efficiency.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 26 pages
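
The decoupling described in this abstract, where the moment estimators never see the weight decay term, can be illustrated with a standard AdamW-style step. This is a minimal sketch in the spirit of AdamW, not the AdamD method proposed in the paper; the function name adamw_step and all hyperparameter defaults are assumptions chosen for illustration.

```python
import math


def adamw_step(w, g, m, v, t, lr=1e-2, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW-style update on lists of floats; t is the 1-based step count."""
    # Moment estimators use the raw (sub)gradient g only, with no decay term mixed in.
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Standard bias correction for the zero-initialized moments.
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]
    # Weight decay is applied directly to the weights, outside the adaptive scaling.
    w = [wi - lr * mh / (math.sqrt(vh) + eps) - lr * weight_decay * wi
         for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w, m, v


# With a zero gradient the moments stay at zero and the weight only shrinks by decay.
w1, m1, v1 = adamw_step([1.0], [0.0], [0.0], [0.0], t=1)
print(w1, m1, v1)
```

In a coupled (L2-regularized) variant, the decay term would instead be added to g before the moment updates, so it would be rescaled by the second-moment estimate.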

arXiv:2307.10053 [pdf, other]

SGD-type Methods with Guaranteed Global Stability in Nonsmooth Nonconvex Optimization

Authors: Nachuan Xiao, Xiaoyin Hu, Kim-Chuan Toh

Abstract: In this paper, we focus on providing convergence guarantees for variants of the stochastic subgradient descent (SGD) method in minimizing nonsmooth nonconvex functions. We first develop a general framework to establish global stability for general stochastic subgradient methods, where the corresponding differential inclusion admits a coercive Lyapunov function. We prove that, with sufficiently small stepsizes and controlled noises, the iterates asymptotically stabilize around the stable set of its corresponding differential inclusion. Then we introduce a scheme for developing SGD-type methods with regularized update directions for the primal variables. Based on our developed framework, we prove the global stability of our proposed scheme under mild conditions. We further illustrate that our scheme yields variants of SGD-type methods, which enjoy guaranteed convergence in training nonsmooth neural networks. In particular, by employing the sign map to regularize the update directions, we propose a novel subgradient method named the Sign-map Regularized SGD method (SRSGD). Preliminary numerical experiments exhibit the high efficiency of SRSGD in training deep neural networks.

Submitted 13 May, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

Comments: 36 pages
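
The idea of regularizing an update direction with the sign map can be sketched as follows. The abstract does not give the exact SRSGD update, so this is an assumed, generic form (a momentum average followed by an elementwise sign) and should not be read as the authors' algorithm; the names sign, sign_momentum_step and the defaults for lr and beta are hypothetical.

```python
def sign(value):
    """Elementwise sign map: returns -1, 0 or 1."""
    return (value > 0) - (value < 0)


def sign_momentum_step(x, g, m, lr=0.1, beta=0.9):
    """One step of momentum SGD whose update direction is passed through the sign map."""
    # Exponential moving average of the stochastic (sub)gradients.
    m = [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, g)]
    # The sign map bounds every coordinate of the update by lr, whatever the gradient scale.
    x = [xi - lr * sign(mi) for xi, mi in zip(x, m)]
    return x, m


x_new, m_new = sign_momentum_step([1.0, -2.0, 0.5], [0.3, -4.0, 0.0], [0.0, 0.0, 0.0])
print(x_new, m_new)
```

Coordinates with a zero momentum entry are left unchanged, and a large gradient entry moves its coordinate by the same amount as a small one.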

arXiv:2305.17351 [pdf, other]

Disambiguated Lexically Constrained Neural Machine Translation

Authors: Jinpeng Zhang, Nini Xiao, Ke Wang, Chuanqi Dong, Xiangyu Duan, Yuqi Zhang, Min Zhang

Abstract: Lexically constrained neural machine translation (LCNMT), which controls the translation generation with pre-specified constraints, is important in many practical applications. Current approaches to LCNMT typically assume that the pre-specified lexical constraints are contextually appropriate. This assumption limits their application to real-world scenarios where a source lexicon may have multiple target constraints, and disambiguation is needed to select the most suitable one. In this paper, we propose disambiguated LCNMT (D-LCNMT) to solve the problem. D-LCNMT is a robust and effective two-stage framework that disambiguates the constraints based on contexts at first, then integrates the disambiguated constraints into LCNMT. Experimental results show that our approach outperforms strong baselines including existing data augmentation based approaches on benchmark datasets, and comprehensive experiments in scenarios where a source lexicon corresponds to multiple target constraints demonstrate the constraint disambiguation superiority of our approach.

Submitted 26 May, 2023; originally announced May 2023.

Comments: Accepted at ACL 2023 as a long paper (Findings), 12 pages, 3 figures

arXiv:2305.03938 [pdf, other]

Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees

Authors: Nachuan Xiao, Xiaoyin Hu, Xin Liu, Kim-Chuan Toh

Abstract: In this paper, we present a comprehensive study on the convergence properties of Adam-family methods for nonsmooth optimization, especially in the training of nonsmooth neural networks. We introduce a novel two-timescale framework that adopts a two-timescale updating scheme, and prove its convergence properties under mild assumptions. Our proposed framework encompasses various popular Adam-family methods, providing convergence guarantees for these methods in training nonsmooth neural networks. Furthermore, we develop stochastic subgradient methods that incorporate gradient clipping techniques for training nonsmooth neural networks with heavy-tailed noise. Through our framework, we show that our proposed methods converge even when the evaluation noises are only assumed to be integrable. Extensive numerical experiments demonstrate the high efficiency and robustness of our proposed methods.

Submitted 19 February, 2024; v1 submitted 6 May, 2023; originally announced May 2023.

Comments: 53 pages

arXiv:2304.10092 [pdf, ps, other]

A Riemannian Dimension-reduced Second Order Method with Application in Sensor Network Localization

Authors: Tianyun Tang, Kim-Chuan Toh, Nachuan Xiao, Yinyu Ye

Abstract: In this paper, we propose a cubic-regularized Riemannian optimization method (RDRSOM), which partially exploits the second order information and achieves the iteration complexity of $\mathcal{O}(1/\epsilon^{3/2})$. In order to reduce the per-iteration computational cost, we further propose a practical version of (RDRSOM), which is an extension of the well known Barzilai-Borwein method and achieves the iteration complexity of $\mathcal{O}(1/\epsilon^{3/2})$. We apply our method to solve a nonlinear formulation of the wireless sensor network localization problem whose feasible set is a Riemannian manifold that has not been considered in the literature before. Numerical experiments are conducted to verify the high efficiency of our algorithm compared to state-of-the-art Riemannian optimization methods and other nonlinear solvers.

Submitted 24 April, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

Comments: 19 pages

arXiv:2304.01467 [pdf, ps, other]

A Partial Exact Penalty Function Approach for Constrained Optimization

Authors: Nachuan Xiao, Xin Liu, Kim-Chuan Toh

Abstract: In this paper, we focus on a class of constrained nonlinear optimization problems (NLP), where some of its equality constraints define a closed embedded submanifold $\mathcal{M}$ in $\mathbb{R}^n$. Although NLP can be solved directly by various existing approaches for constrained optimization in Euclidean space, these approaches usually fail to recognize the manifold structure of $\mathcal{M}$. To achieve better efficiency by utilizing the manifold structure of $\mathcal{M}$ in directly applying these existing optimization approaches, we propose a partial penalty function approach for NLP. In our proposed penalty function approach, we transform NLP into the corresponding constraint dissolving problem (CDP) in the Euclidean space, where the constraints that define $\mathcal{M}$ are eliminated through exact penalization. We establish the relationships on the constraint qualifications between NLP and CDP, and prove that NLP and CDP have the same stationary points and KKT points in a neighborhood of the feasible region under mild conditions. Therefore, various existing optimization approaches developed for constrained optimization in the Euclidean space can be directly applied to solve NLP through CDP. Preliminary numerical experiments demonstrate that by dissolving the constraints that define $\mathcal{M}$, CDP gains superior computational efficiency when compared to directly applying existing optimization approaches to solve NLP, especially in high dimensional scenarios.

Submitted 3 April, 2023; originally announced April 2023.

Comments: 27 pages

arXiv:2212.02923 [pdf, other]

doi 10.1063/5.0137841

Swimming of the midge larva: principles and tricks of locomotion at intermediate Reynolds number

Authors: Bowen **, Chengfeng Pan, Neng Xia, Jialei Song, Haoxiang Luo, Li Zhang, Yang Ding

Abstract: At the millimeter scale and in the intermediate Reynolds number (Re) regime, midge and mosquito larvae can reach swimming speeds of more than one body length per cycle by performing a "figure-of-8" gait, in which their elongated bodies periodically bend nearly into circles and then fully unfold. To elucidate the propulsion mechanism of this cycle of motion, we conducted a 3D numerical study that investigates the hydrodynamics of a body undergoing the prescribed kinematics. Novel propulsion mechanisms were identified, such as modulating the body deformation rate to dynamically increase the maximum net propulsion force, using asymmetric kinematics to generate torque and the appropriate rotation, and controlling the radius of the curled body to manipulate the moment of inertia. The figure-of-8 gait is found to achieve propulsion over a wide range of Re, but is most effective at intermediate Re. The results were further validated experimentally, via the development of a soft millimeter-sized robot that can reach comparable speeds using the figure-of-8 gait.

Submitted 6 December, 2022; originally announced December 2022.

Comments: 12 pages, 11 figures, 2 tables

arXiv:2212.02698 [pdf, other]

CDOpt: A Python Package for a Class of Riemannian Optimization

Authors: Nachuan Xiao, Xiaoyin Hu, Xin Liu, Kim-Chuan Toh

Abstract: Optimization over the embedded submanifold defined by constraints $c(x) = 0$ has attracted much interest over the past few decades due to its wide applications in various areas. Plenty of related optimization packages have been developed based on Riemannian optimization approaches, which rely on some basic geometrical materials of Riemannian manifolds, including retractions, vector transports, etc. These geometrical materials can be challenging to determine in general. Existing packages only accommodate a few well-known manifolds whose geometrical materials are easily accessible. For other manifolds that are not contained in these packages, users have to develop the geometrical materials by themselves. In addition, it is not always tractable to adopt advanced features from various state-of-the-art unconstrained optimization solvers in Riemannian optimization approaches. We introduce CDOpt (available at https://cdopt.github.io/), a user-friendly Python package for a class of Riemannian optimization problems. Based on constraint dissolving approaches, Riemannian optimization problems are transformed into their equivalent unconstrained counterparts in CDOpt. Therefore, solving Riemannian optimization problems through CDOpt directly benefits from various existing solvers and the rich expertise gained over decades for unconstrained optimization. Moreover, all the computations in CDOpt related to any manifold in question are conducted on the expression of its constraints, hence users can easily define new manifolds in CDOpt without any background in differential geometry. Furthermore, CDOpt extends the neural layers from PyTorch and Flax, thus allowing users to train manifold-constrained neural networks directly with solvers for unconstrained optimization. Extensive numerical experiments demonstrate that CDOpt is highly efficient and robust in solving various classes of Riemannian optimization problems.

Submitted 28 March, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: 31 pages

arXiv:2211.08987 [pdf, other]

TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task

Authors: Xin Ge, Ke Wang, Jiayi Wang, Nini Xiao, Xiangyu Duan, Yu Zhao, Yuqi Zhang

Abstract: This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion (TS). We participate in the English-German and English-Chinese tasks. We follow the paradigm of fine-tuning large-scale pre-trained models on the downstream tasks, which has recently achieved great success. We choose FAIR's WMT19 English-German news translation system and MBART50 for English-Chinese as our pre-trained models. Considering the task's condition of limited use of training data, we follow the data augmentation strategies proposed by WeTS to boost our TS model performance. The difference is that we further involve the dual conditional cross-entropy model and GPT-2 language model to filter augmented data. The leaderboard finally shows that our submissions are ranked first in three of four language directions in the Naive TS task of the WMT22 Translation Suggestion task.

Submitted 16 November, 2022; originally announced November 2022.

arXiv:2211.08545 [pdf, other]

MapQA: A Dataset for Question Answering on Choropleth Maps

Authors: Shuaichen Chang, David Palzer, Jialin Li, Eric Fosler-Lussier, Ningchuan Xiao

Abstract: Choropleth maps are a common visual representation for region-specific tabular data and are used in a number of different venues (newspapers, articles, etc.). These maps are human-readable but are often challenging to deal with when trying to extract data for screen readers, analyses, or other related tasks. Recent research into Visual-Question Answering (VQA) has studied question answering on human-generated charts (ChartQA), such as bar, line, and pie charts. However, little work has paid attention to understanding maps; general VQA models and ChartQA models suffer when asked to perform this task. To facilitate and encourage research in this area, we present MapQA, a large-scale dataset of ~800K question-answer pairs over ~60K map images. Our task tests various levels of map understanding, from surface questions about map styles to complex questions that require reasoning on the underlying data. We present the unique challenges of MapQA that frustrate most strong baseline algorithms designed for ChartQA and general VQA tasks. We also present a novel algorithm, Visual Multi-Output Data Extraction based QA (V-MODEQA) for MapQA. V-MODEQA extracts the underlying structured data from a map image with a multi-output model and then performs reasoning on the extracted data. Our experimental results show that V-MODEQA has better overall performance and robustness on MapQA than the state-of-the-art ChartQA and VQA algorithms by capturing the unique properties in map question answering.

Submitted 15 November, 2022; originally announced November 2022.

arXiv:2208.00732 [pdf, ps, other]

An Improved Unconstrained Approach for Bilevel Optimization

Authors: Xiaoyin Hu, Nachuan Xiao, Xin Liu, Kim-Chuan Toh

Abstract: In this paper, we focus on the nonconvex-strongly-convex bilevel optimization problem (BLO). In this BLO, the objective function of the upper-level problem is nonconvex and possibly nonsmooth, and the lower-level problem is smooth and strongly convex with respect to the underlying variable $y$. We show that the feasible region of BLO is a Riemannian manifold. Then we transform BLO to its corresponding unconstrained constraint dissolving problem (CDB), whose objective function is explicitly formulated from the objective functions in BLO. We prove that BLO is equivalent to the unconstrained optimization problem CDB. Therefore, various efficient unconstrained approaches, together with their theoretical results, can be directly applied to BLO through CDB. We propose a unified framework for developing subgradient-based methods for CDB. Remarkably, we show that several existing efficient algorithms can fit the unified framework and be interpreted as descent algorithms for CDB. These examples further demonstrate the great potential of our proposed approach.

Submitted 23 December, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

Comments: 27 pages, revised version

MSC Class: 15A18; 65F15; 65K05; 90C06

arXiv:2206.04766 [pdf]

Developing synthetic individual-level population datasets: The case of contextualizing maps of privacy-preserving census data

Authors: Yue Lin, Ningchuan Xiao

Abstract: The purpose of this paper is to describe the development of a synthetic population dataset that is open and realistic and can be used to facilitate understanding the cartographic process and contextualizing the cartographic artifacts. We first discuss an optimization model that is designed to construct the synthetic population by minimizing the difference between the summarized information of the synthetic populations and the statistics published in census data tables. We then illustrate how the synthetic population dataset can be used to contextualize maps made using privacy-preserving census data. Two counties in Ohio are used as case studies.

Submitted 1 April, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: AutoCarto 2022

arXiv:2205.10500 [pdf, other]

A Constraint Dissolving Approach for Nonsmooth Optimization over the Stiefel Manifold

Authors: Xiaoyin Hu, Nachuan Xiao, Xin Liu, Kim-Chuan Toh

Abstract: This paper focuses on the minimization of a possibly nonsmooth objective function over the Stiefel manifold. The existing approaches either lack efficiency or can only tackle prox-friendly objective functions. We propose a constraint dissolving function named NCDF and show that it has the same first-order stationary points and local minimizers as the original problem in a neighborhood of the Stiefel manifold. Furthermore, we show that the Clarke subdifferential of NCDF is easy to obtain from the Clarke subdifferential of the objective function. Therefore, various existing approaches for unconstrained nonsmooth optimization can be directly applied to nonsmooth optimization problems over the Stiefel manifold. We propose a framework for developing subgradient-based methods and establish their convergence properties based on prior works. Furthermore, based on our proposed framework, we can develop efficient approaches for optimization over the Stiefel manifold. Preliminary numerical experiments further highlight that the proposed constraint dissolving approach yields efficient and direct implementations of various unconstrained approaches to nonsmooth optimization problems over the Stiefel manifold.

Submitted 20 January, 2023; v1 submitted 21 May, 2022; originally announced May 2022.

Comments: Revised version, 26 pages

arXiv:2205.04686 [pdf, ps, other]

AdMix: A Mixed Sample Data Augmentation Method for Neural Machine Translation

Authors: Chang **, Shigui Qiu, Nini Xiao, Hao Jia

Abstract: In Neural Machine Translation (NMT), data augmentation methods such as back-translation have proven their effectiveness in improving translation performance. In this paper, we propose a novel data augmentation approach for NMT, which is independent of any additional training data. Our approach, AdMix, consists of two parts: 1) introduce faint discrete noise (word replacement, word dropping, word swapping) into the original sentence pairs to form augmented samples; 2) generate new synthetic training data by softly mixing the augmented samples with their original samples in the training corpus. Experiments on three translation datasets of different scales show that AdMix achieves significant improvements (1.0 to 2.7 BLEU points) over a strong Transformer baseline. When combined with other data augmentation techniques (e.g., back-translation), our approach can obtain further improvements.

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2203.10319 [pdf, ps, other]

Dissolving Constraints for Riemannian Optimization

Authors: Nachuan Xiao, Xin Liu, Kim-Chuan Toh

Abstract: In this paper, we consider optimization problems over closed embedded submanifolds of $\mathbb{R}^n$, which are defined by the constraints $c(x) = 0$. We propose a class of constraint dissolving approaches for these Riemannian optimization problems. In these proposed approaches, solving a Riemannian optimization problem is transformed into the unconstrained minimization of a constraint dissolving function named CDF. Different from existing exact penalty functions, the exact gradient and Hessian of CDF are easy to compute. We study the theoretical properties of CDF and prove that the original problem and CDF have the same first-order and second-order stationary points, local minimizers, and Łojasiewicz exponents in a neighborhood of the feasible region. Remarkably, the convergence properties of our proposed constraint dissolving approaches can be directly inherited from the existing rich results in unconstrained optimization. Therefore, the proposed constraint dissolving approaches build shortcuts from unconstrained optimization to Riemannian optimization. Several illustrative examples further demonstrate the potential of our proposed constraint dissolving approaches.

Submitted 14 October, 2022; v1 submitted 19 March, 2022; originally announced March 2022.

Comments: 38 pages

arXiv:2110.08986 [pdf, other]

Solving Optimization Problems over the Stiefel Manifold by Smooth Exact Penalty Function

Authors: Nachuan Xiao, Xin Liu

Abstract: In this paper, we present a novel penalty model called ExPen for optimization over the Stiefel manifold. Different from existing penalty functions for orthogonality constraints, ExPen adopts a smooth penalty function without using any first-order derivative of the objective function. We show that all the first-order stationary points of ExPen with a sufficiently large penalty parameter are either feasible, namely, are the first-order stationary points of the original optimization problem, or far from the Stiefel manifold. Besides, the original problem and ExPen share the same second-order stationary points. Remarkably, the exact gradient and Hessian of ExPen are easy to compute. As a consequence, abundant algorithm resources in unconstrained optimization can be applied straightforwardly to solve ExPen.

Submitted 18 December, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

Comments: revised version, 28 pages

arXiv:2108.03597 [pdf]

Atomic structure, electronic structure and optical absorption of inorganic perovskite compounds Cs2SnI6-nXn (X=F, Cl, Br; n= 0~6): A first-principles study

Authors: Wang Xuan, Tang Yehua, Nairui Xiao, Wang Ke-Fan

Abstract: As a possible alternative to organic-inorganic hybrid perovskite halides, inorganic Cs2SnI6 has drawn more and more research attention recently. In order to find more Cs2SnI6 derivatives as potential solar cell absorber materials, the I- ions in Cs2SnI6 are replaced by other halogen ions to form the Cs2SnI6-nXn (X=F, Cl, Br; n=1~6) compounds, whose atomic structures, electronic structures and optical absorption are investigated by first-principles calculations. When the alloying level n increases, the mean lattice constants and the weighted Sn-X and Cs-X bond lengths all decrease linearly; the bond length of each Sn-X diminishes slightly inside the octahedral structure; Eg of Cs2SnI6-nXn increases nonlinearly. Eleven Cs2SnI6-nXn compounds have an Eg between 1.0 eV and 2.0 eV and so can potentially be used as the light absorption layer of solar cells. Their partial DOS demonstrate that as the alloying level n increases, the I 5p orbital in the VBM and CBM is gradually substituted by the Br 4p, Cl 3p, or F 2p orbital. The eleven Cs2SnI6-nXn alloys all have a direct bandgap despite the lattice distortion induced by the alloyed X- ions.

Submitted 8 August, 2021; originally announced August 2021.

Comments: 32 pages (double-spaced), 12 figures, 35 references

arXiv:2108.02365 [pdf]

Hybrid Reasoning Network for Video-based Commonsense Captioning

Authors: Weijiang Yu, Jian Liang, Lei Ji, Lu Li, Yuejian Fang, Nong Xiao, Nan Duan

Abstract: The task of video-based commonsense captioning aims to generate event-wise captions and meanwhile provide multiple commonsense descriptions (e.g., attribute, effect and intention) about the underlying event in the video. Prior works explore the commonsense captions by using separate networks for different commonsense types, which is time-consuming and fails to exploit the interaction between different commonsense types. In this paper, we propose a Hybrid Reasoning Network (HybridNet) to endow the neural networks with the capability of semantic-level reasoning and word-level reasoning. Firstly, we develop multi-commonsense learning for semantic-level reasoning by jointly training different commonsense types in a unified network, which encourages the interaction between the clues of multiple commonsense descriptions, event-wise captions and videos. Then, there are two steps to achieve the word-level reasoning: (1) a memory module records the history predicted sequence from the previous generation processes; (2) a memory-routed multi-head attention (MMHA) module updates the word-level attention maps by incorporating the history information from the memory module into the transformer decoder for word-level reasoning. Moreover, the multimodal features are used to make full use of diverse knowledge for commonsense reasoning. Experiments and abundant analysis on the large-scale Video-to-Commonsense benchmark show that our HybridNet achieves state-of-the-art performance compared with other methods.

Submitted 5 August, 2021; originally announced August 2021.

Comments: 11 pages, 6 figures

MSC Class: 68T07

arXiv:2106.14049 [pdf]

doi 10.1177/03611981221096117

Identifying High Accuracy Regions in Traffic Camera Images to Enhance the Estimation of Road Traffic Metrics: A Quadtree-Based Method

Authors: Yue Lin, Ningchuan Xiao

Abstract: The growing number of real-time camera feeds in urban areas has made it possible to provide high-quality traffic data for effective transportation planning, operations, and management. However, deriving reliable traffic metrics from these camera feeds has been a challenge due to the limitations of current vehicle detection techniques, as well as the various camera conditions such as height and resolution. In this work, a quadtree-based algorithm is developed to continuously partition the image extent until only regions with high detection accuracy remain. These regions are referred to as the high-accuracy identification regions (HAIR) in this paper. We demonstrate how the use of the HAIR can improve the accuracy of traffic density estimates using images from traffic cameras at different heights and resolutions in Central Ohio. Our experiments show that the proposed algorithm can be used to derive robust HAIR where vehicle detection accuracy is 41 percent higher than that in the original image extent. The use of the HAIR also significantly improves the traffic density estimation with an overall decrease of 49 percent in root mean squared error.

Submitted 14 June, 2022; v1 submitted 26 June, 2021; originally announced June 2021.

Comments: Transportation Research Record (2022)

arXiv:2106.03084 [pdf, other]

Combining Static Word Embeddings and Contextual Representations for Bilingual Lexicon Induction

Authors: **peng Zhang, Baijun Ji, Nini Xiao, Xiangyu Duan, Min Zhang, Yangbin Shi, Weihua Luo

Abstract: Bilingual Lexicon Induction (BLI) aims to map words in one language to their translations in another, and is typically done by learning linear projections to align monolingual word representation spaces. Two classes of word representations have been explored for BLI: static word embeddings and contextual representations, but no study has combined both. In this paper, we propose a simple yet effective mechanism to combine the static word embeddings and the contextual representations to utilize the advantages of both paradigms. We test the combination mechanism on various language pairs under the supervised and unsupervised BLI benchmark settings. Experiments show that our mechanism consistently improves performance over robust BLI baselines on all language pairs, with average improvements of 3.2 points in the supervised setting and 3.1 points in the unsupervised setting.

Submitted 10 June, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

Comments: Accepted to Findings of ACL2021

arXiv:2105.00381 [pdf, other]

AGMB-Transformer: Anatomy-Guided Multi-Branch Transformer Network for Automated Evaluation of Root Canal Therapy

Authors: Yunxiang Li, Guodong Zeng, Yifan Zhang, Jun Wang, Qianni Zhang, Qun **, Lingling Sun, Qisi Lian, Neng Xia, Ruizi Peng, Kai Tang, Yaqi Wang, Shuai Wang

Abstract: Accurate evaluation of the treatment result on X-ray images is a significant and challenging step in root canal therapy since the incorrect interpretation of the therapy results will hamper timely follow-up, which is crucial to the patients' treatment outcome. Nowadays, the evaluation is performed manually, which is time-consuming, subjective, and error-prone. In this paper, we aim to automate this process by leveraging the advances in computer vision and artificial intelligence, to provide an objective and accurate method for root canal therapy result assessment. A novel anatomy-guided multi-branch Transformer (AGMB-Transformer) network is proposed, which first extracts a set of anatomy features and then uses them to guide a multi-branch Transformer network for evaluation. Specifically, we design a polynomial curve fitting segmentation strategy with the help of landmark detection to extract the anatomy features. Moreover, a branch fusion module and a multi-branch structure including our progressive Transformer and Group Multi-Head Self-Attention (GMHSA) are designed to focus on both global and local features for an accurate diagnosis. To facilitate the research, we have collected a large-scale root canal therapy evaluation dataset with 245 root canal therapy X-ray images, and the experiment results show that our AGMB-Transformer can improve the diagnosis accuracy from 57.96% to 90.20% compared with the baseline network. The proposed AGMB-Transformer can achieve a highly accurate evaluation of root canal therapy. To the best of our knowledge, our work is the first to perform automatic root canal therapy evaluation and has important clinical value in reducing the workload of endodontists.

Submitted 28 October, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

Comments: under review

arXiv:2103.13814 [pdf, other]

Dynamic Weighted Learning for Unsupervised Domain Adaptation

Authors: Ni Xiao, Lei Zhang

Abstract: Unsupervised domain adaptation (UDA) aims to improve the classification performance on an unlabeled target domain by leveraging information from a fully labeled source domain. Recent approaches explore domain-invariant and class-discriminant representations to tackle this task. These methods, however, ignore the interaction between domain alignment learning and class discrimination learning. As a result, a missing or inadequate tradeoff between domain alignment and class discrimination makes them prone to the problem of negative transfer. In this paper, we propose Dynamic Weighted Learning (DWL) to avoid the discriminability vanishing problem caused by excessive alignment learning and the domain misalignment problem caused by excessive discriminant learning. Technically, DWL dynamically weights the learning losses of alignment and discriminability by introducing the degree of alignment and discriminability. Besides, the problem of sample imbalance across domains is first considered in our work, and we solve the problem by weighing the samples to guarantee information balance across domains. Extensive experiments demonstrate that DWL has an excellent performance in several benchmark datasets.

Submitted 22 March, 2021; originally announced March 2021.

Comments: This paper has been accepted by CVPR2021

arXiv:2103.03514 [pdf, ps, other]

A Penalty-free Infeasible Approach for a Class of Nonsmooth Optimization Problems over the Stiefel Manifold

Authors: Nachuan Xiao, Xin Liu, Ya-xiang Yuan

Abstract: Transforming into an exact penalty function model with convex compact constraints yields efficient infeasible approaches for optimization problems with orthogonality constraints. For smooth and $\ell_{2,1}$-norm regularized cases, these infeasible approaches adopt a simple and orthonormalization-free updating scheme and show their high efficiency in the test examples. However, to avoid orthonormalization while enforcing the feasibility of the final solution, these infeasible approaches introduce a quadratic penalty term, where an inappropriate penalty parameter can lead to numerical inefficiency. Inspired by penalty-free approaches for smooth optimization problems, we propose a proximal first-order algorithm for a class of optimization problems with orthogonality constraints and a nonsmooth regularization term. The resulting algorithm, named sequential linearized proximal gradient method (SLPG), alternately takes tangential steps and normal steps to improve the optimality and feasibility respectively. In SLPG, the orthonormalization process is invoked only once at the last step if high precision in feasibility is needed, showing that the main iterations in SLPG are orthonormalization-free. Besides, both the tangential steps and normal steps do not involve the penalty parameter, and thus SLPG is penalty-free and avoids the inefficiency caused by an inappropriate penalty parameter. We analyze the global convergence properties of SLPG where the tangential steps are inexactly computed. By inexactly computing tangential steps, for smooth cases and $\ell_{2,1}$-norm regularized cases, SLPG has a closed-form updating scheme, which leads to its cheap tangential steps. Numerical experiments illustrate the numerical advantages of SLPG when compared with existing first-order methods.

Submitted 28 March, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

arXiv:2011.02763 [pdf, other]

Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction

Authors: Xuanzhao Wang, Zhengping Che, Bo Jiang, Ning Xiao, Ke Yang, Jian Tang, Jieping Ye, Jingyu Wang, Qi Qi

Abstract: Video anomaly detection is commonly used in many applications such as security surveillance and is very challenging. A majority of recent video anomaly detection approaches utilize deep reconstruction models, but their performance is often suboptimal because of insufficient reconstruction error differences between normal and abnormal video frames in practice. Meanwhile, frame prediction-based anomaly detection methods have shown promising performance. In this paper, we propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design which is more in line with the characteristics of surveillance videos. The proposed method is equipped with a multi-path ConvGRU-based frame prediction network that can better handle semantically informative objects and areas of different scales and capture spatial-temporal dependencies in normal videos. A noise tolerance loss is introduced during training to mitigate the interference caused by background noise. Extensive experiments have been conducted on the CUHK Avenue, ShanghaiTech Campus, and UCSD Pedestrian datasets, and the results show that our proposed method outperforms existing state-of-the-art approaches. Remarkably, our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.

Submitted 27 May, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

Comments: Paper accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS). Article DOI: 10.1109/TNNLS.2021.3083152

arXiv:2003.08770 [pdf, other]

ElixirNet: Relation-aware Network Architecture Adaptation for Medical Lesion Detection

Authors: Chenhan Jiang, Shaoju Wang, Hang Xu, Xiaodan Liang, Nong Xiao

Abstract: Most advances in medical lesion detection networks are limited to subtle modifications of the conventional detection networks designed for natural images. However, there exists a vast domain gap between medical images and natural images, where medical image detection often suffers from several domain-specific challenges, such as high lesion/background similarity, dominant tiny lesions, and severe class imbalance. Is a hand-crafted detection network tailored for natural images undoubtedly good enough over a discrepant medical lesion domain? Are there more powerful operations, filters, and sub-networks that better fit the medical lesion detection problem to be discovered? In this paper, we introduce a novel ElixirNet that includes three components: 1) TruncatedRPN balances positive and negative data for false positive reduction; 2) Auto-lesion Block is automatically customized for medical images to incorporate relation-aware operations among region proposals, and leads to more suitable and efficient classification and localization; 3) Relation transfer module incorporates the semantic relationship and transfers the relevant contextual information with an interpretable graph, thus alleviating the problem of lack of annotations for all types of lesions. Experiments on DeepLesion and Kits19 prove the effectiveness of ElixirNet, achieving improvement of both sensitivity and precision over FPN with fewer parameters.

Submitted 3 March, 2020; originally announced March 2020.

Comments: 7 pages, 5 figures, AAAI2020

arXiv:2001.10641 [pdf, other]

doi 10.32614/RJ-2020-007

The Rockerverse: Packages and Applications for Containerization with R

Authors: Daniel Nüst, Dirk Eddelbuettel, Dom Bennett, Robrecht Cannoodt, Dav Clark, Gergely Daroczi, Mark Edmondson, Colin Fay, Ellis Hughes, Lars Kjeldgaard, Sean Lopp, Ben Marwick, Heather Nolis, Jacqueline Nolis, Hong Ooi, Karthik Ram, Noam Ross, Lori Shepherd, Péter Sólymos, Tyson Lee Swetnam, Nitesh Turaga, Charlotte Van Petegem, Jason Williams, Craig Willis, Nan Xiao

Abstract: The Rocker Project provides widely used Docker images for R across different application scenarios. This article surveys downstream projects that build upon the Rocker Project images and presents the current state of R packages for managing Docker images and controlling containers. These use cases cover diverse topics such as package development, reproducible research, collaborative work, cloud-based data processing, and production deployment of services. The variety of applications demonstrates the power of the Rocker Project specifically and containerisation in general. Across the diverse ways to use containers, we identified common themes: reproducible environments, scalability and efficiency, and portability across clouds. We conclude that the current growth and diversification of use cases is likely to continue its positive impact, but see the need for consolidating the Rockerverse ecosystem of packages, developing common practices for applications, and exploring alternative containerisation software.

Submitted 17 August, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

Comments: Source code for article available at https://github.com/nuest/rockerverse-paper/ Updated version includes some new paragraphs and corrections throughout the text; full diff available at https://github.com/nuest/rockerverse-paper/compare/preprint.v2...preprint.v3

MSC Class: 68N01 ACM Class: D.2.6; D.2.7; K.6.3

Journal ref: The R Journal (2020), 12:1, pages 437-461

arXiv:1911.01543 [pdf, other]

Physics driven reduced order model for real time blood flow simulations

Authors: Sethuraman Sankaran, David Lesage, Rhea Tombropoulos, Nan Xiao, Hyun Jin Kim, David Spain, Michiel Schaap, Charles A. Taylor

Abstract: Predictive modeling of blood flow and pressure has numerous applications ranging from non-invasive assessment of functional significance of disease to planning invasive procedures. While several such predictive modeling techniques have been proposed, their use in the clinic has been limited due in part to the significant time required to perform virtual interventions and compute the resultant changes in hemodynamic conditions. We propose a fast hemodynamic assessment method based on first constructing an exploration space of geometries, tailored to each patient, and subsequently building a physics driven reduced order model in this space. We demonstrate that this method can predict fractional flow reserve derived from coronary computed tomography angiography in response to changes to a patient-specific lumen geometry in real time while achieving high accuracy when compared to computational fluid dynamics simulations. We validated this method on over 1300 patients that received a coronary CT scan and demonstrated a correlation coefficient of 0.98 with an error of 0.005 ± 0.015 (95% confidence interval: (-0.020, 0.031)) as compared to three-dimensional blood flow calculations.

Submitted 4 November, 2019; originally announced November 2019.

arXiv:1910.11475 [pdf, other]

Heterogeneous Graph Learning for Visual Commonsense Reasoning

Authors: Weijiang Yu, Jingwen Zhou, Weihao Yu, Xiaodan Liang, Nong Xiao

Abstract: The visual commonsense reasoning task aims at leading the research field into solving cognition-level reasoning with the ability of predicting correct answers and meanwhile providing convincing reasoning paths, resulting in three sub-tasks, i.e., Q->A, QA->R and Q->AR. It poses great challenges over the proper semantic alignment between vision and linguistic domains and knowledge reasoning to generate persuasive reasoning paths. Existing works either resort to a powerful end-to-end network that cannot produce interpretable reasoning paths or solely explore intra-relationship of visual objects (homogeneous graph) while ignoring the cross-domain semantic alignment among visual concepts and linguistic words. In this paper, we propose a new Heterogeneous Graph Learning (HGL) framework for seamlessly integrating the intra-graph and inter-graph reasoning in order to bridge the vision and language domains. Our HGL consists of a primal vision-to-answer heterogeneous graph (VAHG) module and a dual question-to-answer heterogeneous graph (QAHG) module to interactively refine reasoning paths for semantic agreement. Moreover, our HGL integrates a contextual voting module to exploit a long-range visual context for better global reasoning. Experiments on the large-scale Visual Commonsense Reasoning benchmark demonstrate the superior performance of our proposed modules on three tasks (improving 5% accuracy on Q->A, 3.5% on QA->R, 5.8% on Q->AR).

Submitted 24 October, 2019; originally announced October 2019.

Comments: 11 pages, 5 figures

MSC Class: 68T01

arXiv:1910.01923 [pdf, other]

Layout-Graph Reasoning for Fashion Landmark Detection

Authors: Weijiang Yu, Xiaodan Liang, Ke Gong, Chenhan Jiang, Nong Xiao, Liang Lin

Abstract: Detecting dense landmarks for diverse clothes, as a fundamental technique for clothes analysis, has attracted increasing research attention due to its huge application potential. However, due to the lack of modeling underlying semantic layout constraints among landmarks, prior works often detect ambiguous and structure-inconsistent landmarks of multiple overlapped clothes in one person. In this paper, we propose to seamlessly enforce structural layout relationships among landmarks on the intermediate representations via multiple stacked layout-graph reasoning layers. We define the layout-graph as a hierarchical structure including a root node, body-part nodes (e.g. upper body, lower body), coarse clothes-part nodes (e.g. collar, sleeve) and leaf landmark nodes (e.g. left-collar, right-collar). Each Layout-Graph Reasoning (LGR) layer aims to map feature representations into structural graph nodes via a Map-to-Node module, performs reasoning over structural graph nodes to achieve global layout coherency via a layout-graph reasoning module, and then maps graph nodes back to enhance feature representations via a Node-to-Map module. The layout-graph reasoning module integrates a graph clustering operation to generate representations of intermediate nodes (bottom-up inference) and then a graph deconvolution operation (top-down inference) over the whole graph. Extensive experiments on two public fashion landmark datasets demonstrate the superiority of our model. Furthermore, to advance the fine-grained fashion landmark research for supporting more comprehensive clothes generation and attribute recognition, we contribute the first Fine-grained Fashion Landmark Dataset (FFLD) containing 200k images annotated with at most 32 key-points for 13 clothes types.

Submitted 4 October, 2019; originally announced October 2019.

Comments: 9 pages, 5 figures, CVPR2019

MSC Class: I.4.9

arXiv:1909.09677 [pdf, other]

doi 10.1145/3343031.3350883

Gradual Network for Single Image De-raining

Authors: Zhe Huang, Weijiang Yu, Wayne Zhang, Litong Feng, Nong Xiao

Abstract: Most advances in single image de-raining meet a key challenge, which is removing rain streaks with different scales and shapes while preserving image details. Existing single image de-raining approaches treat rain-streak removal as a process of pixel-wise regression directly. However, they are lacking in mining the balance between over-de-raining (e.g. removing texture details in rain-free regions) and under-de-raining (e.g. leaving rain streaks). In this paper, we firstly propose a coarse-to-fine network called Gradual Network (GraNet) consisting of coarse stage and fine stage for delving into single image de-raining with different granularities. Specifically, to reveal coarse-grained rain-streak characteristics (e.g. long and thick rain streaks/raindrops), we propose a coarse stage by utilizing local-global spatial dependencies via a local-global subnetwork composed of region-aware blocks. Taking the residual result (the coarse de-rained result) between the rainy image sample (i.e. the input data) and the output of coarse stage (i.e. the learnt rain mask) as input, the fine stage continues to de-rain by removing the fine-grained rain streaks (e.g. light rain streaks and water mist) to get a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block. Solid and comprehensive experiments on synthetic and real data demonstrate that our GraNet can significantly outperform the state-of-the-art methods by removing rain streaks with various densities, scales and shapes while keeping the image details of rain-free regions well-preserved.

Submitted 20 September, 2019; originally announced September 2019.

Comments: In Proceedings of the 27th ACM International Conference on Multimedia (MM 2019)

arXiv:1908.08670 [pdf, other]

On the estimation of high-dimensional integrated covariance matrix based on high-frequency data with multiple transactions

Authors: Moming Wang, Ningning Xia, You Zhou

Abstract: Due to the mechanism of recording, the presence of multiple transactions at each recording time becomes a common feature for high-frequency data in financial markets. Using random matrix theory, this paper considers the estimation of integrated covariance (ICV) matrices of high-dimensional diffusion processes based on multiple high-frequency observations. We start by studying the estimator, the time-variation adjusted realized covariance (TVA) matrix, proposed in Zheng and Li (2011) without microstructure noise. We show that in the high-dimensional case, for a class C of diffusion processes, the limiting spectral distribution (LSD) of averaged TVA depends not only on that of ICV, but also on the numbers of multiple transactions at each recording time. However, in practice, the observed prices are always contaminated by the market microstructure noise. Thus the limiting behavior of pre-averaging averaged TVA matrices is studied based on the noisy multiple observations. We show that for processes in class C, the pre-averaging averaged TVA has the desirable properties that it eliminates the effects of microstructure noise and multiple transactions, and its LSD depends solely on that of the ICV matrix. Further, three types of nonlinear shrinkage estimators of ICV are proposed based on high-frequency noisy multiple observations. Simulation studies support our theoretical results and show the finite sample performance of the proposed estimators. Finally, the high-frequency portfolio strategies are evaluated under these estimators in real data analysis.

Submitted 5 September, 2019; v1 submitted 23 August, 2019; originally announced August 2019.

arXiv:1904.09824 [pdf, other]

Judging Chemical Reaction Practicality From Positive Sample only Learning

Authors: Shu Jiang, Zhuosheng Zhang, Hai Zhao, Jiangtong Li, Yang Yang, Bao-Liang Lu, Ning Xia

Abstract: Chemical reaction practicality is the core task among all symbol intelligence based chemical information processing; for example, it provides an indispensable clue for further automatic synthesis route inference. Considering that chemical reactions have been represented in a language form, we propose a new solution to generally judge the practicality of organic reactions without considering complex quantum physical modeling or chemistry knowledge. While tackling the practicality judgment as a machine learning task from positive and negative (chemical reaction) samples, all existing studies have to carefully handle the serious insufficiency of negative samples. We propose an auto-construction method to solve this widespread, long-standing difficulty. Experimental results show our model can effectively predict the practicality of chemical reactions, achieving a high accuracy of 99.76% on real large-scale chemical lab reaction practicality judgment.

Submitted 22 April, 2019; originally announced April 2019.

arXiv:1810.12829 [pdf, other]

doi 10.1109/TIP.2018.2878956

Cross-Modal Attentional Context Learning for RGB-D Object Detection

Authors: Guanbin Li, Yukang Gan, Hejun Wu, Nong Xiao, Liang Lin

Abstract: Recognizing objects from simultaneously sensed photometric (RGB) and depth channels is a fundamental yet practical problem in many machine vision applications such as robot grasping and autonomous driving. In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data. Compared to existing RGB-D object detection frameworks, our approach has several appealing properties. First, it consists of an attention-based global context model for exploiting adaptive contextual information and incorporating this information into a region-based CNN (e.g., Fast RCNN) framework to achieve improved object detection performance. Second, our CMAC framework further contains a fine-grained object part attention module to harness multiple discriminative object parts inside each possible object region for superior local feature representation. While greatly improving the accuracy of RGB-D object detection, the effective cross-modal information fusion as well as attentional context modeling in our proposed model provide an interpretable visualization scheme. Experimental results demonstrate that the proposed method significantly improves upon the state of the art on all public benchmarks.

Submitted 30 October, 2018; originally announced October 2018.

Comments: Accepted as a regular paper to IEEE Transactions on Image Processing

arXiv:1810.11707 [pdf]

From Communication to Sensing: Recognizing and Counting Repetitive Motions with Wireless Backscattering

Authors: Ning Xiao, Panlong Yang, Yubo Yan, Hao Zhou, Xiang-Yang Li, Haohua Du

Abstract: Recently several ground-breaking RF-based motion recognition systems were proposed to detect and/or recognize macro/micro human movements. These systems often suffer from various interferences caused by multiple users moving simultaneously, resulting in extremely low recognition accuracy. To tackle this challenge, we propose a novel system, called Motion-Fi, which marries battery-free wireless backscattering and device-free sensing. Motion-Fi is an accurate, interference-tolerant motion-recognition system, which counts repetitive motions without using scenario-dependent templates or profiles and enables multiple users to perform certain motions simultaneously because of the relatively short transmission range of backscattered signals. Although the repetitive motions are fairly well detectable through the backscattering signals in theory, in reality they get blended into various other system noises during the motion. Moreover, irregular motion patterns among users will lead to expensive computation cost for motion recognition. We build a backscattering wireless platform to validate our design in various scenarios for over 6 months when different persons, distances and orientations are incorporated. In our experiments, the periodicity in motions could be recognized without any learning or training process, and such motions can be counted within 5% count error. With little effort in learning the patterns, our method could achieve 93.1% motion-recognition accuracy for a variety of motions. Moreover, by leveraging the periodicity of motions, the recognition accuracy could be further improved to nearly 100% with only 3 repetitions. Our experiments also show that the motions of multiple persons separated by around 2 meters cause little accuracy reduction in the counting process.

Submitted 27 October, 2018; originally announced October 2018.

arXiv:1810.10697 [pdf]

COUSTIC: Combinatorial Double auction for Task Assignment in Device-to-Device Clouds

Authors: Yutong Zhai, Liusheng Huang, Long Chen, Ning Xiao, Yangyang Geng

Abstract: With the emerging technologies of Internet of Things (IOTs), the capabilities of mobile devices have increased tremendously. However, in the big data era, completing tasks on one device is still challenging. As an emerging technology, crowdsourcing, which utilizes crowds of devices to facilitate large scale sensing tasks, has been gaining more and more research attention. Most existing works either assume devices are willing to cooperate utilizing centralized mechanisms or design incentive algorithms using double auctions. The former is not practical when there is no centralized controller, and the latter is not suitable when the seller device is also resource constrained. In this paper, we propose a truthful incentive mechanism with combinatorial double auction for crowd sensing task assignment in device-to-device (D2D) clouds, where a single mobile device with an intensive sensing task can hire a group of idle neighboring devices. With this new mechanism, time critical sensing tasks can be handled in time in a distributed manner. We prove that the proposed mechanism is truthful, individually rational, budget balanced and computationally efficient. Our simulation results demonstrate that the combinatorial double auction mechanism achieves gains of 26.3% and 15.8% in comparison to the existing double auction scheme and the centralized maximum matching based algorithm respectively.

Submitted 24 October, 2018; originally announced October 2018.

Comments: 17 pages, 7 figures, Accepted by 18th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2018)

arXiv:1612.01057 [pdf, other]

Learning to Segment Object Candidates via Recursive Neural Networks

Authors: Tianshui Chen, Liang Lin, Xian Wu, Nong Xiao, Xiaonan Luo

Abstract: To avoid the exhaustive search over locations and scales, current state-of-the-art object detection systems usually involve a crucial component generating a batch of candidate object proposals from images. In this paper, we present a simple yet effective approach for segmenting object proposals via a deep architecture of recursive neural networks (ReNNs), which hierarchically groups regions for detecting object candidates over scales. Unlike traditional methods that mainly adopt fixed similarity measures for merging regions or finding object proposals, our approach adaptively learns the region merging similarity and the objectness measure during the process of hierarchical region grouping. Specifically, guided by a structured loss, the ReNN model jointly optimizes the cross-region similarity metric with the region merging process as well as the objectness prediction. During inference of the object proposal generation, we introduce randomness into the greedy search to cope with the ambiguity of grouping regions. Extensive experiments on standard benchmarks, e.g., PASCAL VOC and ImageNet, suggest that our approach is capable of producing object proposals with high recall while well preserving the object boundaries and outperforms other existing methods in both accuracy and efficiency.

Submitted 28 July, 2018; v1 submitted 3 December, 2016; originally announced December 2016.

Comments: Accepted at TIP

arXiv:1611.06753 [pdf, ps, other]

Shrinkage estimation of covariance matrix for portfolio choice with high frequency data

Authors: Cheng Liu, Ningning Xia, Jun Yu

Abstract: This paper examines the usefulness of high frequency data in estimating the covariance matrix for portfolio choice when the portfolio size is large. A computationally convenient nonlinear shrinkage estimator for the integrated covariance (ICV) matrix of financial assets is developed in two steps. The eigenvectors of the ICV are first constructed from a designed time variation adjusted realized covariance matrix of noise-free log-returns of relatively low frequency data. Then the regularized eigenvalues of the ICV are estimated by quasi-maximum likelihood based on high frequency data. The estimator is always positive definite and its inverse is the estimator of the inverse of ICV. It minimizes the limit of the out-of-sample variance of portfolio returns within the class of rotation-equivalent estimators. It works when the number of underlying assets is larger than the number of time series observations in each asset and when the asset price follows a general stochastic process. Our theoretical results are derived under the assumption that the number of assets ($p$) and the sample size ($n$) satisfy $p/n \to y > 0$ as $n \to \infty$. The advantages of our proposed estimator are demonstrated using real data.

Submitted 21 November, 2016; originally announced November 2016.

arXiv:1611.06744 [pdf, other]

Convergence rate of eigenvector empirical spectral distribution of large Wigner matrices

Authors: Ningning Xia, Zhidong Bai

Abstract: In this paper, we adopt the eigenvector empirical spectral distribution (VESD) to investigate the limiting behavior of eigenvectors of a large dimensional Wigner matrix W_n. In particular, we derive the optimal bound for the rate of convergence of the expected VESD of W_n to the semicircle law, which is of order O(n^{-1/2}) under the assumption of having finite 10th moment. We further show that the convergence rates in probability and almost surely of the VESD are O(n^{-1/4}) and O(n^{-1/6}), respectively, under finite 8th moment condition. Numerical studies demonstrate that the convergence rate does not depend on the choice of unit vector involved in the VESD function, and the best possible bound for the rate of convergence of the VESD is of order O(n^{-1/2}).

Submitted 21 November, 2016; originally announced November 2016.

arXiv:1604.03638 [pdf, other]

On the inference about the spectral distribution of high-dimensional covariance matrix based on high-frequency noisy observations

Authors: Ningning Xia, Xinghua Zheng

Abstract: In practice, observations are often contaminated by noise, making the resulting sample covariance matrix a signal-plus-noise sample covariance matrix. Aiming to make inferences about the spectral distribution of the population covariance matrix under such a situation, we establish an asymptotic relationship that describes how the limiting spectral distribution of (signal) sample covariance matrices depends on that of signal-plus-noise-type sample covariance matrices. As an application, we consider inferences about the spectral distribution of integrated covolatility (ICV) matrices of high-dimensional diffusion processes based on high-frequency data with microstructure noise. The (slightly modified) pre-averaging estimator is a signal-plus-noise sample covariance matrix, and the aforementioned result, together with a (generalized) connection between the spectral distribution of signal sample covariance matrices and that of the population covariance matrix, enables us to propose a two-step procedure to consistently estimate the spectral distribution of ICV for a class of diffusion processes. An alternative approach is further proposed, which possesses several desirable properties: it is more robust, it eliminates the effects of microstructure noise, and the asymptotic relationship that enables consistent estimation of the spectral distribution of ICV is the standard Marcenko-Pastur equation. The performance of the two approaches is examined via simulation studies under both synchronous and asynchronous observation settings.

Submitted 1 March, 2017; v1 submitted 12 April, 2016; originally announced April 2016.

Comments: arXiv admin note: text overlap with arXiv:1409.2121

Showing 1–50 of 60 results for author: Xiao, N