-
Flavor mixing and CP violation from the interplay of $S_4$ modular group and gCP
Authors:
Bu-Yao Qu,
Xiang-Gan Liu,
Ping-Tao Chen,
Gui-Jun Ding
Abstract:
We have performed a systematical analysis of lepton and quark masses models based on $Γ_4\cong S_4$ modular symmetry with gCP symmetry. We have considered both cases that neutrinos are Majorana particles and Dirac particles. All possible nontrivial representation assignments of matter fields are considered, and the most general form of fermion mass matrices are given. The phenomenologically viable models with the lowest number of free parameters together with the results of fit are presented. We find out nine lepton models with seven real free parameters including the real and imaginary parts of modulus for Majorana neutrinos, which can accommodate the lepton masses and neutrino oscillation data. The prediction for leptogenesis is studied in an example lepton model. The observed baryon asymmetry as well as lepton masses and mixing angles can be explained. For Dirac neutrinos, four lepton models with five real free couplings are compatible with experimental data. Ten quark models containing seven couplings are found to be able to accommodate the hierarchical quark masses and mixing angles and CP violation phase. Furthermore, the $S_4$ modular symmetry can provide a unified description of lepton and quark flavor structure, and a benchmark model is presented.
Submitted 22 June, 2021;
originally announced June 2021.
-
Single-atom verification of the noise-resilient and fast characteristics of universal nonadiabatic noncyclic geometric quantum gates
Authors:
J. W. Zhang,
L. -L. Yan,
J. C. Li,
G. Y. Ding,
J. T. Bu,
L. Chen,
S. -L. Su,
F. Zhou,
M. Feng
Abstract:
Quantum gates induced by geometric phases are intrinsically robust against noise due to their global properties of the evolution paths. Compared to conventional nonadiabatic geometric quantum computation (NGQC), the recently proposed nonadiabatic noncyclic geometric quantum computation (NNGQC) works in a faster fashion, while still remaining the robust feature of the geometric operations. Here, we experimentally implement the NNGQC in a single trapped ultracold $^{40}$Ca$^{+}$ ion for verifying the noise-resilient and fast feature. By performing unitary operations under imperfect conditions, we witness the advantages of the NNGQC with measured fidelities by quantum process tomography in comparison with other two quantum gates by conventional NGQC and by straightforwardly dynamical evolution. Our results provide the first evidence confirming the possibility of accelerated quantum information processing with limited systematic errors even in the imperfect situation.
Submitted 18 June, 2021;
originally announced June 2021.
-
Anomaly Detection in Video Sequences: A Benchmark and Computational Model
Authors:
Boyang Wan,
Wenhui Jiang,
Yuming Fang,
Zhiyuan Luo,
Guanqun Ding
Abstract:
Anomaly detection has attracted considerable research attention. However, existing anomaly detection databases encounter two major problems. Firstly, they are limited in scale. Secondly, training sets contain only video-level labels indicating the existence of an abnormal event during the full video while lacking annotations of precise time durations. To tackle these problems, we contribute a new Large-scale Anomaly Detection (LAD) database as the benchmark for anomaly detection in video sequences, which is featured in two aspects. 1) It contains 2000 video sequences including normal and abnormal video clips with 14 anomaly categories including crash, fire, violence, etc. with large scene varieties, making it the largest anomaly analysis database to date. 2) It provides the annotation data, including video-level labels (abnormal/normal video, anomaly type) and frame-level labels (abnormal/normal video frame) to facilitate anomaly detection. Leveraging the above benefits from the LAD database, we further formulate anomaly detection as a fully-supervised learning problem and propose a multi-task deep neural network to solve it. We first obtain the local spatiotemporal contextual feature by using an Inflated 3D convolutional (I3D) network. Then we construct a recurrent convolutional neural network fed the local spatiotemporal contextual feature to extract the spatiotemporal contextual feature. With the global spatiotemporal contextual feature, the anomaly type and score can be computed simultaneously by a multi-task neural network. Experimental results show that the proposed method outperforms the state-of-the-art anomaly detection methods on our database and other public databases of anomaly detection. Codes are available at https://github.com/wanboyang/anomaly_detection_LAD2000.
Submitted 16 June, 2021;
originally announced June 2021.
-
Dual-Modality Vehicle Anomaly Detection via Bilateral Trajectory Tracing
Authors:
Jingyuan Chen,
Guanchen Ding,
Yuchen Yang,
Wenwei Han,
Kangmin Xu,
Tianyi Gao,
Zhe Zhang,
Wanping Ouyang,
Hao Cai,
Zhenzhong Chen
Abstract:
Traffic anomaly detection has played a crucial role in Intelligent Transportation System (ITS). The main challenges of this task lie in the highly diversified anomaly scenes and variational lighting conditions. Although much work has managed to identify the anomaly in homogenous weather and scene, few resolved to cope with complex ones. In this paper, we proposed a dual-modality modularized methodology for the robust detection of abnormal vehicles. We introduced an integrated anomaly detection framework comprising the following modules: background modeling, vehicle tracking with detection, mask construction, Region of Interest (ROI) backtracking, and dual-modality tracing. Concretely, we employed background modeling to filter the motion information and left the static information for later vehicle detection. For the vehicle detection and tracking module, we adopted YOLOv5 and multi-scale tracking to localize the anomalies. Besides, we utilized the frame difference and tracking results to identify the road and obtain the mask. In addition, we introduced multiple similarity estimation metrics to refine the anomaly period via backtracking. Finally, we proposed a dual-modality bilateral tracing module to refine the time further. The experiments conducted on the Track 4 testset of the NVIDIA 2021 AI City Challenge yielded a result of 0.9302 F1-Score and 3.4039 root mean square error (RMSE), indicating the effectiveness of our framework.
Submitted 9 June, 2021;
originally announced June 2021.
-
RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
Authors:
Xiaohan Ding,
Chunlong Xia,
Xiangyu Zhang,
Xiaojie Chu,
Jungong Han,
Guiguang Ding
Abstract:
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers. Compared to convolutional layers, FC layers are more efficient, better at modeling the long-range dependencies and positional patterns, but worse at capturing the local structures, hence usually less favored for image recognition. We propose a structural re-parameterization technique that adds local prior into an FC to make it powerful for image recognition. Specifically, we construct convolutional layers inside a RepMLP during training and merge them into the FC for inference. On CIFAR, a simple pure-MLP model shows performance very close to CNN. By inserting RepMLP in traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs. Our intriguing findings highlight that combining the global representational capacity and positional perception of FC with the local prior of convolution can improve the performance of neural network with faster speed on both the tasks with translation invariance (e.g., semantic segmentation) and those with aligned images and positional patterns (e.g., face recognition). The code and models are available at https://github.com/DingXiaoH/RepMLP.
Submitted 30 March, 2022; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Modular $S_4\times SU(5)$ GUT
Authors:
Gui-Jun Ding,
Stephen F. King,
Chang-Yuan Yao
Abstract:
Modular symmetry offers the possibility to provide an origin of discrete flavour symmetry and to break it along particular symmetry preserving directions without introducing flavons or driving fields. It is also possible to use a weighton field to account for charged fermion mass hierarchies rather than a Froggatt-Nielsen mechanism. Such an approach can be applied to flavoured Grand Unified Theories (GUTs) which can be greatly simplified using modular forms. As an example, we consider a modular version of a previously proposed $S_4\times SU(5)$ GUT, with Gatto-Sartori-Tonin and Georgi-Jarlskog relations, in which all flavons and driving fields are removed, with their effect replaced by modular forms with moduli assumed to be at various fixed points, rendering the theory much simpler. In the neutrino sector there are two right-handed neutrinos constituting a Littlest Seesaw model satisfying Constrained Sequential Dominance (CSD) where the two columns of the Dirac neutrino mass matrix are proportional to $(0,1, -1)$ and $(1, n, 2-n)$ respectively, and $n=1+\sqrt{6}\approx 3.45$ is prescribed by the modular symmetry, with predictions subject to charged lepton mixing corrections. We perform a numerical analysis, showing quark and lepton mass and mixing correlations around the best fit points.
Submitted 29 September, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
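The Constrained Sequential Dominance structure quoted in the abstract above can be checked numerically. The following is only a small sketch: the variable names are hypothetical, and the columns are written without any overall normalization or coupling constants, which the abstract does not specify.

```python
import math

# n is fixed by the modular symmetry according to the abstract: n = 1 + sqrt(6).
n = 1.0 + math.sqrt(6.0)

# Directions of the two columns of the Dirac neutrino mass matrix
# (overall factors omitted; names are illustrative only).
column_1 = (0.0, 1.0, -1.0)
column_2 = (1.0, n, 2.0 - n)

print(round(n, 2))  # 3.45, matching the rounded value quoted in the abstract
print(column_2)
```

Note that the entries of the second column always sum to 3, whatever the value of n.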
-
Diverse Branch Block: Building a Convolution as an Inception-like Unit
Authors:
Xiaohan Ding,
Xiangyu Zhang,
Jungong Han,
Guiguang Ding
Abstract:
We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs. The block is named Diverse Branch Block (DBB), which enhances the representational capacity of a single convolution by combining diverse branches of different scales and complexities to enrich the feature space, including sequences of convolutions, multi-scale convolutions, and average pooling. After training, a DBB can be equivalently converted into a single conv layer for deployment. Unlike the advancements of novel ConvNet architectures, DBB complicates the training-time microstructure while maintaining the macro architecture, so that it can be used as a drop-in replacement for regular conv layers of any architecture. In this way, the model can be trained to reach a higher level of performance and then transformed into the original inference-time structure for inference. DBB improves ConvNets on image classification (up to 1.9% higher top-1 accuracy on ImageNet), object detection and semantic segmentation. The PyTorch code and models are released at https://github.com/DingXiaoH/DiverseBranchBlock.
Submitted 29 March, 2021; v1 submitted 24 March, 2021;
originally announced March 2021.
-
Computational Emotion Analysis From Images: Recent Advances and Future Directions
Authors:
Sicheng Zhao,
Quanwei Huang,
Youbao Tang,
Xingxu Yao,
Jufeng Yang,
Guiguang Ding,
Björn W. Schuller
Abstract:
Emotions are usually evoked in humans by images. Recently, extensive research efforts have been dedicated to understanding the emotions of images. In this chapter, we aim to introduce image emotion analysis (IEA) from a computational perspective with the focus on summarizing recent advances and suggesting future directions. We begin with commonly used emotion representation models from psychology. We then define the key computational problems that the researchers have been trying to solve and provide supervised frameworks that are generally used for different IEA tasks. After the introduction of major challenges in IEA, we present some representative methods on emotion feature extraction, supervised classifier learning, and domain adaptation. Furthermore, we introduce available datasets for evaluation and summarize some main results. Finally, we discuss some open questions and future directions that researchers can pursue.
Submitted 19 March, 2021;
originally announced March 2021.
-
A Data-Centric Framework for Composable NLP Workflows
Authors:
Zhengzhong Liu,
Guanxiong Ding,
Avinash Bukkittu,
Mansi Gupta,
Pengzhi Gao,
Atif Ahmed,
Shikun Zhang,
Xin Gao,
Swapnil Singhavi,
Linwei Li,
Wei Wei,
Zecong Hu,
Haoran Shi,
Haoying Zhang,
Xiaodan Liang,
Teruko Mitamura,
Eric P. Xing,
Zhiting Hu
Abstract:
Empirical natural language processing (NLP) systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components, ranging from data ingestion, human annotation, to text retrieval, analysis, generation, and visualization. We establish a unified open-source framework to support fast development of such sophisticated NLP workflows in a composable manner. The framework introduces a uniform data representation to encode heterogeneous results by a wide range of NLP tasks. It offers a large repository of processors for NLP tasks, visualization, and annotation, which can be easily assembled with full interoperability under the unified representation. The highly extensible framework allows plugging in custom processors from external off-the-shelf NLP and deep learning libraries. The whole framework is delivered through two modularized yet integratable open-source projects, namely Forte (for workflow infrastructure and NLP function processors) and Stave (for user interaction, visualization, and annotation).
Submitted 1 September, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
CP Symmetry and Symplectic Modular Invariance
Authors:
Gui-Jun Ding,
Ferruccio Feruglio,
Xiang-Gan Liu
Abstract:
We analyze CP symmetry in symplectic modular-invariant supersymmetric theories. We show that for genus $g\ge 3$ the definition of CP is unique, while two independent possibilities are allowed when $g\le 2$. We discuss the transformation properties of moduli, matter multiplets and modular forms in the Siegel upper half plane, as well as in invariant subspaces. We identify CP-conserving surfaces in the fundamental domain of moduli space. We make use of all these elements to build a CP and symplectic invariant model of lepton masses and mixing angles, where known data are well reproduced and observable phases are predicted in terms of a minimum number of parameters.
Submitted 12 February, 2021;
originally announced February 2021.
-
Origins of minimized lattice thermal conductivity and enhanced thermoelectric performance in WS2/WSe2 lateral superlattice
Authors:
Yonglan Hu,
Tie Yang,
Dengfeng Li,
Guangqian Ding,
Chaochao Dun,
Dandan Wu,
Xiaotian Wang
Abstract:
We report a configuration strategy for improving the thermoelectric (TE) performance of two-dimensional (2D) transition metal dichalcogenide (TMDC) WS2 based on the experimentally prepared WS2/WSe2 lateral superlattice (LS) crystal. On the basis of density functional theory combined with the Boltzmann transport equation, we show that the TE figure of merit zT of monolayer WS2 is remarkably enhanced when forming into a WS2/WSe2 LS crystal. This is primarily ascribed to the almost halved lattice thermal conductivity due to the enhanced anharmonic processes. Electronic transport properties parallel (xx) and perpendicular (yy) to the superlattice period are highly symmetric for both p- and n-doped LS owing to the nearly isotropic lifetime of charge carriers. The spin-orbital effect causes a significant split of conduction band and leads to three-fold degenerate sub-bands and high density of states (DOS), which offers opportunity to obtain the high n-type Seebeck coefficient (S). Interestingly, the separated degenerate sub-bands and upper conduction band in monolayer WS2 form a remarkable stairlike DOS, yielding a higher S. The hole carriers with much higher mobility than electrons reveal the high p-type power factor and the potential to be good p-type TE materials with an optimal zT exceeding 1 at 400 K in WS2/WSe2 LS.
Submitted 31 January, 2021;
originally announced February 2021.
-
$SU(5)$ GUTs with $A_4$ modular symmetry
Authors:
Peng Chen,
Gui-Jun Ding,
Stephen F. King
Abstract:
We combine $SU(5)$ Grand Unified Theories (GUTs) with $A_4$ modular symmetry and present a comprehensive analysis of the resulting quark and lepton mass matrices for all the simplest cases. Classifying the models according to the representation assignments of the matter fields under $A_4$, we find that there are seven types of $SU(5)$ models with $A_4$ modular symmetry. We present 53 benchmark models with the fewest free parameters. The parameter space of each model is scanned to optimize the agreement between predictions and experimental data, and predictions for the masses and mixing parameters of quarks and leptons are given at the best fitting points. The best fit predictions for the leptonic CP violating Dirac phase, the lightest neutrino mass and the neutrinoless double beta decay parameter when displayed graphically are observed to cover a wide range of possible values, but are clustered around particular regions, allowing future neutrino experiments to discriminate between the different types of models.
Submitted 29 January, 2021;
originally announced January 2021.
-
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
Authors:
Xin He,
Shihao Wang,
Xiaowen Chu,
Shaohuai Shi,
Jiangping Tang,
Xin Liu,
Chenggang Yan,
Jiyong Zhang,
Guiguang Ding
Abstract:
The COVID-19 pandemic has spread globally for several months. Because its transmissibility and high pathogenicity seriously threaten people's lives, it is crucial to accurately and quickly detect COVID-19 infection. Many recent studies have shown that deep learning (DL) based solutions can help detect COVID-19 based on chest CT scans. However, most existing work focuses on 2D datasets, which may result in low quality models as the real CT scans are 3D images. Besides, the reported results span a broad spectrum on different datasets with a relatively unfair comparison. In this paper, we first use three state-of-the-art 3D models (ResNet3D101, DenseNet3D121, and MC3\_18) to establish the baseline performance on the three publicly available chest CT scan datasets. Then we propose a differentiable neural architecture search (DNAS) framework to automatically search for the 3D DL models for 3D chest CT scans classification with the Gumbel Softmax technique to improve the searching efficiency. We further exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results. The experimental results show that our automatically searched models (CovidNet3D) outperform the baseline human-designed models on the three datasets with tens of times smaller model size and higher accuracy. Furthermore, the results also verify that CAM can be well applied in CovidNet3D for COVID-19 datasets to provide interpretability for medical diagnosis.
Submitted 12 February, 2021; v1 submitted 13 January, 2021;
originally announced January 2021.
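The abstract above mentions the Gumbel Softmax technique for differentiable architecture search. As a rough illustration of the general idea (not the authors' implementation), the sketch below draws one relaxed one-hot sample over a set of candidate operations. The function name, the example logits and the temperature are assumptions chosen for this example.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed one-hot sample over the choices scored by `logits`."""
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)            # keep u > 0 so log(u) is defined
        gumbel_noise = -math.log(-math.log(u))  # standard Gumbel(0, 1) noise
        perturbed.append((logit + gumbel_noise) / temperature)
    peak = max(perturbed)                       # subtract the max for numerical stability
    exps = [math.exp(p - peak) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate operations in one search cell.
rng = random.Random(0)
weights = gumbel_softmax_sample([1.0, 0.5, -0.5], temperature=0.5, rng=rng)
print(weights)
```

Lower temperatures push the sample closer to a hard one-hot choice, while higher temperatures spread the weight more evenly across candidates.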
-
RepVGG: Making VGG-style ConvNets Great Again
Authors:
Xiaohan Ding,
Xiangyu Zhang,
Ningning Ma,
Jungong Han,
Guiguang Ding,
Jian Sun
Abstract:
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology. Such decoupling of the training-time and inference-time architecture is realized by a structural re-parameterization technique so that the model is named RepVGG. On ImageNet, RepVGG reaches over 80% top-1 accuracy, which is the first time for a plain model, to the best of our knowledge. On NVIDIA 1080Ti GPU, RepVGG models run 83% faster than ResNet-50 or 101% faster than ResNet-101 with higher accuracy and show favorable accuracy-speed trade-off compared to the state-of-the-art models like EfficientNet and RegNet. The code and trained models are available at https://github.com/megvii-model/RepVGG.
Submitted 29 March, 2021; v1 submitted 10 January, 2021;
originally announced January 2021.
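The structural re-parameterization described in the RepVGG abstract above relies on the linearity of convolution: parallel branches can be folded into one kernel after training. The sketch below illustrates this under strong simplifying assumptions (a single channel, no batch normalization, a scalar 1x1 branch, and hypothetical function names); it is not the code from the linked repository.

```python
def conv3x3_same(image, kernel):
    """3x3 cross-correlation with zero padding, so the output size equals the input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += kernel[di][dj] * image[y][x]
    return out


def multi_branch(image, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar k1) + identity branch."""
    y3 = conv3x3_same(image, k3)
    return [[y3[i][j] + k1 * image[i][j] + image[i][j]
             for j in range(len(image[0]))] for i in range(len(image))]


def merge_kernels(k3, k1):
    """Inference-time form: fold the 1x1 and identity branches into the 3x3 centre tap."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0
    return merged


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, -0.25, 0.0], [0.25, 1.0, -0.5], [0.0, 0.75, 0.125]]
k1 = 0.5

branched = multi_branch(image, k3, k1)
fused = conv3x3_same(image, merge_kernels(k3, k1))
max_diff = max(abs(a - b) for ra, rb in zip(branched, fused) for a, b in zip(ra, rb))
print(max_diff)
```

The two forms agree up to floating-point rounding, which is why the multi-branch model can be deployed as a plain stack of 3x3 convolutions.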
-
Strengthened chain theorems for different versions of 4-connectivity
Authors:
Guoli Ding,
Chengfu Qin
Abstract:
The chain theorem of Tutte states that every 3-connected graph can be constructed from a wheel $W_n$ by repeatedly adding edges and splitting vertices. It is not difficult to prove the following strengthening of this theorem: every non-wheel 3-connected graph can be constructed from $W_4$ by repeatedly adding edges and splitting vertices. In this paper we similarly strengthen several chain theorems for various versions of 4-connectivity.
Submitted 27 December, 2020;
originally announced December 2020.
-
Modular Invariant $A_{4}$ Models for Quarks and Leptons with Generalized CP Symmetry
Authors:
Chang-Yuan Yao,
Jun-Nan Lu,
Gui-Jun Ding
Abstract:
We perform a systematical analysis of the $A_4$ modular models with generalized CP for the masses and flavor mixing of quarks and leptons, and the most general form of the quark and lepton mass matrices is given. The CP invariance requires all couplings real in the chosen basis and thus the vacuum expectation value of the modulus $τ$ uniquely breaks both the modular symmetry and CP symmetry. The phenomenologically viable models with minimal number of free parameters and the results of fit are presented. We find 20 models with 7 real free parameters that can accommodate the experimental data of lepton sector. We then apply $A_4$ modular symmetry to the quark sector to explain quark masses and CKM mixing matrix, the minimal viable quark model is found to contain 10 free real parameters. Finally, we give two predictive quark-lepton unification models which use only 16 real free parameters to explain the flavor patterns of both quarks and leptons.
Submitted 24 December, 2020;
originally announced December 2020.
-
Emotional Semantics-Preserved and Feature-Aligned CycleGAN for Visual Emotion Adaptation
Authors:
Sicheng Zhao,
Xuanbai Chen,
Xiangyu Yue,
Chuang Lin,
Pengfei Xu,
Ravi Krishna,
Jufeng Yang,
Guiguang Ding,
Alberto L. Sangiovanni-Vincentelli,
Kurt Keutzer
Abstract:
Thanks to large-scale labeled training data, deep neural networks (DNNs) have obtained remarkable success in many vision and multimedia tasks. However, because of the presence of domain shift, the learned knowledge of the well-trained DNNs cannot be well generalized to new domains or datasets that have few labels. Unsupervised domain adaptation (UDA) studies the problem of transferring models trained on one labeled source domain to another unlabeled target domain. In this paper, we focus on UDA in visual emotion analysis for both emotion distribution learning and dominant emotion classification. Specifically, we design a novel end-to-end cycle-consistent adversarial model, termed CycleEmotionGAN++. First, we generate an adapted domain to align the source and target domains on the pixel-level by improving CycleGAN with a multi-scale structured cycle-consistency loss. During the image translation, we propose a dynamic emotional semantic consistency loss to preserve the emotion labels of the source images. Second, we train a transferable task classifier on the adapted domain with feature-level alignment between the adapted and target domains. We conduct extensive UDA experiments on the Flickr-LDL & Twitter-LDL datasets for distribution learning and ArtPhoto & FI datasets for emotion classification. The results demonstrate the significant improvements yielded by the proposed CycleEmotionGAN++ as compared to state-of-the-art UDA approaches.
Submitted 24 November, 2020;
originally announced November 2020.
-
Field-Tuned Quantum Effects in a Triangular-Lattice Ising Magnet
Authors:
Yayuan Qin,
Yao Shen,
Changle Liu,
Hongliang Wo,
Yonghao Gao,
Yu Feng,
Xiaowen Zhang,
Gaofeng Ding,
Yiqing Gu,
Qisi Wang,
Shoudong Shen,
Helen C. Walker,
Robert Bewley,
Jianhui Xu,
Martin Boehm,
Paul Steffens,
Seiko Ohira-Kawamura,
Naoki Murai,
Astrid Schneidewind,
Xin Tong,
Gang Chen,
Jun Zhao
Abstract:
We report thermodynamic and neutron scattering measurements of the triangular-lattice quantum Ising magnet TmMgGaO 4 in longitudinal magnetic fields. Our experiments reveal a quasi-plateau state induced by quantum fluctuations. This state exhibits an unconventional non-monotonic field and temperature dependence of the magnetic order and excitation gap. In the high field regime where the quantum fl…
▽ More
We report thermodynamic and neutron scattering measurements of the triangular-lattice quantum Ising magnet TmMgGaO 4 in longitudinal magnetic fields. Our experiments reveal a quasi-plateau state induced by quantum fluctuations. This state exhibits an unconventional non-monotonic field and temperature dependence of the magnetic order and excitation gap. In the high field regime where the quantum fluctuations are largely suppressed, we observed a disordered state with coherent magnon-like excitations despite the suppression of the spin excitation intensity. Through detailed semi-classical calculations, we are able to understand these behaviors quantitatively from the subtle competition between quantum fluctuations and frustrated Ising interactions.
Submitted 12 September, 2021; v1 submitted 18 November, 2020;
originally announced November 2020.
-
CDT: Cascading Decision Trees for Explainable Reinforcement Learning
Authors:
Zihan Ding,
Pablo Hernandez-Leal,
Gavin Weiguang Ding,
Changjian Li,
Ruitong Huang
Abstract:
Deep Reinforcement Learning (DRL) has recently achieved significant advances in various domains. However, explaining the policy of RL agents still remains an open problem due to several factors, one being the complexity of explaining neural networks decisions. Recently, a group of works have used decision-tree-based models to learn explainable policies. Soft decision trees (SDTs) and discretized differentiable decision trees (DDTs) have been demonstrated to achieve both good performance and share the benefit of having explainable policies. In this work, we further improve the results for tree-based explainable RL in both performance and explainability. Our proposal, Cascading Decision Trees (CDTs) apply representation learning on the decision path to allow richer expressivity. Empirical results show that in both situations, where CDTs are used as policy function approximators or as imitation learners to explain black-box policies, CDTs can achieve better performances with more succinct and explainable models than SDTs. As a second contribution our study reveals limitations of explaining black-box policies via imitation learning with tree-based explainable models, due to its inherent instability.
Submitted 30 March, 2021; v1 submitted 15 November, 2020;
originally announced November 2020.
-
Topological contextuality and anyonic statistics of photonic-encoded parafermions
Authors:
Zheng-Hao Liu,
Kai Sun,
Jiannis K. Pachos,
Mu Yang,
Yu Meng,
Yu-Wei Liao,
Qiang Li,
Jun-Feng Wang,
Ze-Yu Luo,
Yi-Fei He,
Dong-Yu Huang,
Guang-Rui Ding,
Jin-Shi Xu,
Yong-Jian Han,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
Quasiparticle poisoning, expected to arise during the measurement of Majorana zero mode state, poses a fundamental problem towards the realization of Majorana-based quantum computation. Parafermions, a natural generalization of Majorana fermions, can encode topological qudits immune to quasiparticle poisoning. While parafermions are expected to emerge in superconducting fractional quantum Hall systems, they are not yet attainable with current technology. To bypass this problem, we employ a photonic quantum simulator to experimentally demonstrate the key components of parafermion-based universal quantum computation. Our contributions in this article are twofold. First, by manipulating the photonic states, we realize Clifford operator Berry phases that correspond to braiding statistics of parafermions. Second, we investigate the quantum contextuality in a topological system for the first time by demonstrating the contextuality of parafermion encoded qudit states. Importantly, we find that the topologically-encoded contextuality opens the way to magic state distillation, while both the contextuality and the braiding-induced Clifford gates are resilient against local noise. By introducing contextuality, our photonic quantum simulation provides the first step towards a physically robust methodology for realizing topological quantum computation.
Submitted 11 July, 2021; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Fermion Masses and Mixing from Double Cover and Metaplectic Cover of $A_5$ Modular Group
Authors:
Chang-Yuan Yao,
Xiang-Gan Liu,
Gui-Jun Ding
Abstract:
We perform a comprehensive study of the homogeneous finite modular group $A'_5$ which is the double covering of $A_5$. The integral weight and level 5 modular forms have been constructed up to weight 6 and they are decomposed into the irreducible representations of $A'_5$. Then we perform a systematical analysis of the $A'_5$ modular models for lepton masses and mixing. The phenomenologically viable models with minimal number of free parameters and the results of fit are presented. We find out 15 models with 9 real free parameters which can accommodate the experimental data of lepton sector. After including generalized CP symmetry, 9 viable models with 7 free parameters are found out. We apply $A'_5$ modular symmetry to the quark sector, and a quark-lepton unification model is given. The framework of modular invariance is extended to include the rational weight modular forms of level 5. The ring of modular forms at level 5 can be generated by two algebraically independent weight $1/5$ modular forms denoted by $F_1(τ)$ and $F_2(τ)$. We give the expressions of the rational weight modular forms of level 5 up to weight $3$ and arrange them into the irreducible multiplets of finite metaplectic group $\widetildeΓ_5\cong A'_5\times Z_5$. A neutrino mass model with $\widetildeΓ_5$ modular symmetry is presented, and the phenomenological predictions of the model are analyzed numerically.
Submitted 18 May, 2021; v1 submitted 6 November, 2020;
originally announced November 2020.
-
Inverse centrifugal effect induced by collective motion of vortices in rotating turbulent convection
Authors:
Shan-Shan Ding,
Kai Leong Chong,
Jun-Qiang Shi,
Guang-Yu Ding,
Hao-Yuan Lu,
Ke-Qing Xia,
Jin-Qiang Zhong
Abstract:
When a fluid system is subject to strong rotation, centrifugal fluid motion is expected, i.e., denser (lighter) fluid moves outward (inward) from (toward) the axis of rotation. Here we demonstrate, both experimentally and numerically, the existence of an unexpected outward motion of warm and lighter vortices in rotating turbulent convection. This anomalous vortex motion occurs under rapid rotations when the centrifugal buoyancy is sufficiently strong to induce a symmetry-breaking in the vorticity field, i.e., the vorticity of the cold anticyclones overrides that of the warm cyclones. We show that through hydrodynamic interactions the densely populated vortices can self-aggregate into coherent clusters and exhibit collective motion in this flow regime. Interestingly, the correlation of the vortex velocity fluctuations within a cluster is scale-free, with the correlation length being about 30% of the cluster length. Such long-range correlation leads to the collective outward motion of cyclones. Our study provides new understanding of vortex dynamics that are widely present in nature.
Submitted 29 October, 2020;
originally announced October 2020.
-
Construction and Application of Teaching System Based on Crowdsourcing Knowledge Graph
Authors:
Jinta Weng,
Ying Gao,
Jing Qiu,
Guozhu Ding,
Huanqin Zheng
Abstract:
This work combines a crowdsourcing knowledge graph with a teaching system and studies methods for generating the knowledge graph and its applications. Two crowdsourcing approaches, crowdsourcing task distribution and reverse captcha generation, are used to construct a knowledge graph in the field of teaching systems. A complete hierarchical knowledge graph of the teaching domain is generated from nodes for school, student, teacher, course, knowledge point and exercise type. A knowledge graph constructed in a crowdsourcing manner requires many users to participate collaboratively, with full consideration of teachers' guidance and users' mobilization issues. Based on the three subgraphs of the knowledge graph, prominent teachers, student learning situations and suitable learning routes can be visualized. A personalized exercise recommendation model is used to formulate personalized exercises by an algorithm based on the knowledge graph. A collaborative creation model is developed to realize the crowdsourcing construction mechanism. Despite users' unfamiliarity with the knowledge-graph learning mode and learners' limited attention to the knowledge structure, the system based on the crowdsourcing knowledge graph can still achieve high acceptance among students and teachers.
Submitted 18 October, 2020;
originally announced October 2020.
-
Automorphic Forms and Fermion Masses
Authors:
Gui-Jun Ding,
Ferruccio Feruglio,
Xiang-Gan Liu
Abstract:
We extend the framework of modular invariant supersymmetric theories to encompass invariance under more general discrete groups $Γ$, that allow the presence of several moduli and make connection with the theory of automorphic forms. Moduli span a coset space $G/K$, where $G$ is a Lie group and $K$ is a compact subgroup of $G$, modded out by $Γ$. For a general choice of $G$, $K$, $Γ$ and a generic matter content, we explicitly construct a minimal Kähler potential and a general superpotential, for both rigid and local $N=1$ supersymmetric theories. We also specialize our construction to the case $G=Sp(2g,R)$, $K=U(g)$ and $Γ=Sp(2g,Z)$, whose automorphic forms are Siegel modular forms. We show how our general theory can be consistently restricted to multi-dimensional regions of the moduli space enjoying residual symmetries. After choosing $g=2$, we present several examples of models for lepton and quark masses where Yukawa couplings are Siegel modular forms of level 2.
Submitted 15 October, 2020;
originally announced October 2020.
-
Unavoidable Induced Subgraphs of Large 2-Connected Graphs
Authors:
Sarah Allred,
Guoli Ding,
Bogdan Oporowski
Abstract:
Ramsey proved that for every positive integer $n$, every sufficiently large graph contains an induced $K_n$ or $\overline{K}_n$. Among the many extensions of Ramsey's Theorem there is an analogue for connected graphs: for every positive integer $n$, every sufficiently large connected graph contains an induced $K_n$, $K_{1,n}$, or $P_n$. In this paper, we establish an analogue for 2-connected graphs. In particular, we prove that for every integer $n$ exceeding two, every sufficiently large 2-connected graph contains one of the following as an induced subgraph: $K_n$, a subdivision of $K_{2,n}$, a subdivision of $K_{2,n}$ with an edge between the two vertices of degree $n$, and a well-defined structure similar to a ladder.
Submitted 17 September, 2021; v1 submitted 25 September, 2020;
originally announced September 2020.
-
Trimaximal neutrino mixing from scotogenic $A_4$ family symmetry
Authors:
Gui-Jun Ding,
Jun-Nan Lu,
Jose W. F. Valle
Abstract:
We propose a flavour theory of leptons implementing an $A_4$ family symmetry. Our scheme provides a simple way to derive trimaximal neutrino mixing from first principles, leading to simple and testable predictions for neutrino mixing and CP violation. Dark matter mediates neutrino mass generation, as in the simplest scotogenic model.
Submitted 10 September, 2020;
originally announced September 2020.
-
3D Spectrum Mapping Based on ROI-Driven UAV Deployment
Authors:
Qihui Wu,
Feng Shen,
Zheng Wang,
Guoru Ding
Abstract:
Given the explosive growth of Internet of Things (IoT) devices ranging from the two-dimensional (2D) ground to the three-dimensional (3D) space, it is a necessity to establish a 3D spectrum map to comprehensively present and effectively manage the 3D spatial spectrum resources in smart city infrastructures. By leveraging the popularity and location flexibility of the unmanned aerial vehicles (UAVs), we are able to execute spatial sampling with these emerging flying spectrum-monitoring devices (SMDs) at will. In this paper, we first present a brief survey to show the state-of-the-art studies on spectrum mapping. Then, we introduce the 3D spectrum mapping model. Next, we propose a 3D spectrum mapping framework which is composed of pre-sampling, spectrum situation estimation, UAV deployment and spectrum recovery. Therein we develop a Region of Interest (ROI)-driven UAV deployment scheme, which selects new sampling points of the highest estimated interest and the lowest energy cost iteratively. Meanwhile, we slice the entire 3D spectrum map into a series of "images" and "repair" those unsampled locations. Furthermore, we provide an exemplary case study on the 3D spectrum mapping, where, for example, an important event is being held and the entire spectrum situation needs to be monitored in real time to deal with malicious interference sources. Lastly, the challenges and open issues are discussed.
Submitted 6 August, 2020;
originally announced August 2020.
-
Cooperative Control of Mobile Robots with Stackelberg Learning
Authors:
Joewie J. Koh,
Guohui Ding,
Christoffer Heckman,
Lijun Chen,
Alessandro Roncone
Abstract:
Multi-robot cooperation requires agents to make decisions that are consistent with the shared goal without disregarding action-specific preferences that might arise from asymmetry in capabilities and individual objectives. To accomplish this goal, we propose a method named SLiCC: Stackelberg Learning in Cooperative Control. SLiCC models the problem as a partially observable stochastic game composed of Stackelberg bimatrix games, and uses deep reinforcement learning to obtain the payoff matrices associated with these games. Appropriate cooperative actions are then selected with the derived Stackelberg equilibria. Using a bi-robot cooperative object transportation problem, we validate the performance of SLiCC against centralized multi-agent Q-learning and demonstrate that SLiCC achieves better combined utility.
Submitted 3 August, 2020;
originally announced August 2020.
-
Generalized Perfect Optical Vortex along Arbitrary Trajectories
Authors:
Yue Chen,
Tingchang Wang,
Yuxuan Ren,
Zhaoxiang Fang,
Guangrui Ding,
Liqun He,
Rongde Lu,
Kun Huang
Abstract:
A perfect optical vortex (POV) is a type of vortex beam with an infinitely thin ring and a fixed radius independent of its topological charge. Here we propose the concept of a generalized perfect optical vortex along arbitrary curves beyond the regular shapes of circle and ellipse. Generalized perfect optical vortices share similar properties with POVs, such as being defined only along infinitely thin curves and carrying topological charges independent of scale. Notably, they naturally degenerate to the POVs and elliptic POVs along circles and ellipses, respectively. We also experimentally generated the generalized perfect optical vortices through a digital micromirror device (DMD) and measured the phase distributions by interferometry, exhibiting good agreement with the simulations. Moreover, we derive a modified formula to yield generalized perfect optical vortices with uniform intensity distribution along predesigned curves. The generalized perfect optical vortices might find potential applications in optical tweezers and communication.
Submitted 28 July, 2020;
originally announced July 2020.
-
Neural Kalman Filtering for Speech Enhancement
Authors:
Wei Xue,
Gang Quan,
Chao Zhang,
Guohong Ding,
Xiaodong He,
Bowen Zhou
Abstract:
Statistical signal processing based speech enhancement methods adopt expert knowledge to design the statistical models and linear filters, which is complementary to the deep neural network (DNN) based methods which are data-driven. In this paper, by using expert knowledge from statistical signal processing for network design and optimization, we extend the conventional Kalman filtering (KF) to the supervised learning scheme, and propose the neural Kalman filtering (NKF) for speech enhancement. Two intermediate clean speech estimates are first produced from recurrent neural networks (RNN) and linear Wiener filtering (WF) separately and are then linearly combined by a learned NKF gain to yield the NKF output. Supervised joint training is applied to NKF to learn to automatically trade-off between the instantaneous linear estimation made by the WF and the long-term non-linear estimation made by the RNN. The NKF method can be seen as using expert knowledge from WF to regularize the RNN estimations to improve its generalization ability to the noise conditions unseen in the training. Experiments in different noisy conditions show that the proposed method outperforms the baseline methods both in terms of objective evaluation metrics and automatic speech recognition (ASR) word error rates (WERs).
Submitted 16 April, 2021; v1 submitted 27 July, 2020;
originally announced July 2020.
-
Half-integral weight modular forms and application to neutrino mass models
Authors:
Xiang-Gan Liu,
Chang-Yuan Yao,
Bu-Yao Qu,
Gui-Jun Ding
Abstract:
We generalize the modular invariance approach to include the half-integral weight modular forms. Accordingly the modular group should be extended to its metaplectic covering group for consistency. We introduce the well-defined half-integral weight modular forms for congruence subgroup $Γ(4N)$ and show that they can be decomposed into the irreducible multiplets of finite metaplectic group $\widetildeΓ_{4N}$. We construct concrete expressions of the half-integral/integral modular forms for $Γ(4)$ up to weight 6 and arrange them into the irreducible representations of $\widetildeΓ_4$. We present three typical models with $\widetildeΓ_4$ modular symmetry for neutrino masses and mixing, and the phenomenological predictions of each model are analyzed numerically.
Submitted 27 July, 2020;
originally announced July 2020.
-
Heat transport scaling and transition in geostrophic rotating convection with varying aspect ratio
Authors:
Hao-Yuan Lu,
Guang-Yu Ding,
Jun-Qiang Shi,
Ke-Qing Xia,
Jin-Qiang Zhong
Abstract:
We present high-precision experimental and numerical studies of the Nusselt number $Nu$ as functions of the Rayleigh number $Ra$ in geostrophic rotating convection with domain aspect ratio $Γ$ varying from 0.4 to 3.8 and the Ekman number Ek from $2.0{\times}10^{-7}$ to $2.7{\times}10^{-5}$. The heat-transport data $Nu(Ra)$ reveal a gradual transition from buoyancy-dominated to geostrophic convection at large $Ek$, whereas the transition becomes sharp with decreasing $Ek$. We determine the power-law scaling of $Nu{\sim}Ra^γ$, and show that the boundary flows give rise to pronounced enhancement of $Nu$ in a broad range of the geostrophic regime, leading to reduction of the scaling exponent $γ$ in small $Γ$ cells. The present work provides new insight into the heat-transport scaling in geostrophic convection and may explain the discrepancies observed in previous studies.
Submitted 26 July, 2020;
originally announced July 2020.
-
ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting
Authors:
Xiaohan Ding,
Tianxiang Hao,
Jianchao Tan,
Ji Liu,
Jungong Han,
Yuchen Guo,
Guiguang Ding
Abstract:
We propose ResRep, a novel method for lossless channel pruning (a.k.a. filter pruning), which slims down a CNN by reducing the width (number of output channels) of convolutional layers. Inspired by the neurobiology research about the independence of remembering and forgetting, we propose to re-parameterize a CNN into the remembering parts and forgetting parts, where the former learn to maintain the performance and the latter learn to prune. Via training with regular SGD on the former but a novel update rule with penalty gradients on the latter, we realize structured sparsity. Then we equivalently merge the remembering and forgetting parts into the original architecture with narrower layers. In this sense, ResRep can be viewed as a successful application of Structural Re-parameterization. Such a methodology distinguishes ResRep from the traditional learning-based pruning paradigm that applies a penalty on parameters to produce sparsity, which may suppress the parameters essential for the remembering. ResRep slims down a standard ResNet-50 with 76.15% accuracy on ImageNet to a narrower one with only 45% FLOPs and no accuracy drop, which is the first to achieve lossless pruning with such a high compression ratio. The code and models are at https://github.com/DingXiaoH/ResRep.
Submitted 14 August, 2021; v1 submitted 7 July, 2020;
originally announced July 2020.
-
Modular Invariant Quark and Lepton Models in Double Covering of $S_4$ Modular Group
Authors:
Xiang-Gan Liu,
Chang-Yuan Yao,
Gui-Jun Ding
Abstract:
We perform a comprehensive analysis of the homogeneous finite modular group $Γ'_4\equiv S'_4$ which is the double covering of $S_4$ group. The weight 1 modular forms of level 4 are constructed in terms of Dedekind eta function, and they transform as a triplet $\mathbf{\hat{3}'}$ of $S'_4$. The integral weight modular forms until weight 6 are built from the tensor products of weight 1 modular forms. We perform a systematical classification of $S'_4$ modular models for lepton masses and mixing with/without generalized CP, where the left-handed leptons are assigned to triplet of $S'_4$ and right-handed charged leptons transform as singlets under $S'_4$, and we consider both scenarios where the neutrino masses arise from Weinberg operator or type I seesaw mechanism. The phenomenological implications of the minimal models for lepton masses, mixing angles, CP violation phases and neutrinoless double decay are discussed. The $S'_4$ modular symmetry is extended to quark sector, we present several predictive models which use nine or ten free parameters including real and imaginary parts of $τ$ to describe quark masses and Cabibbo-Kobayashi-Maskawa mixing matrix. We give a quark-lepton unified model which can explain the flavor structure of quarks and leptons simultaneously for a common value of $τ$.
Submitted 23 March, 2021; v1 submitted 18 June, 2020;
originally announced June 2020.
-
Shallow Feature Based Dense Attention Network for Crowd Counting
Authors:
Yunqi Miao,
Zijia Lin,
Guiguang Ding,
Jungong Han
Abstract:
While the performance of crowd counting via deep learning has been improved dramatically in the recent years, it remains an ingrained problem due to cluttered backgrounds and varying scales of people within an image. In this paper, we propose a Shallow feature based Dense Attention Network (SDANet) for crowd counting from still images, which diminishes the impact of backgrounds via involving a shallow feature based attention model, and meanwhile, captures multi-scale information via densely connecting hierarchical image features. Specifically, inspired by the observation that backgrounds and human crowds generally have noticeably different responses in shallow features, we decide to build our attention model upon shallow-feature maps, which results in accurate background-pixel detection. Moreover, considering that the most representative features of people across different scales can appear in different layers of a feature extraction network, to better keep them all, we propose to densely connect hierarchical image features of different layers and subsequently encode them for estimating crowd density. Experimental results on three benchmark datasets clearly demonstrate the superiority of SDANet when dealing with different scenarios. Particularly, on the challenging UCF CC 50 dataset, our method outperforms other existing methods by a large margin, as is evident from a remarkable 11.9% Mean Absolute Error (MAE) drop of our SDANet.
Submitted 17 June, 2020;
originally announced June 2020.
-
Modular Invariant Models of Leptons at Level 7
Authors:
Gui-Jun Ding,
Stephen F. King,
Cai-Chang Li,
Ye-Ling Zhou
Abstract:
We consider for the first time level 7 modular invariant flavour models where the lepton mixing originates from the breaking of modular symmetry and couplings responsible for lepton masses are modular forms. The latter are decomposed into irreducible multiplets of the finite modular group $Γ_7$, which is isomorphic to $PSL(2,Z_{7})$, the projective special linear group of two dimensional matrices over the finite Galois field of seven elements, containing 168 elements, sometimes written as $PSL_2(7)$ or $Σ(168)$. At weight 2, there are 26 linearly independent modular forms, organised into a triplet, a septet and two octets of $Γ_7$. A full list of modular forms up to weight 8 are provided. Assuming the absence of flavons, the simplest modular-invariant models based on $Γ_7$ are constructed, in which neutrinos gain masses via either the Weinberg operator or the type-I seesaw mechanism, and their predictions compared to experiment.
Submitted 27 April, 2020;
originally announced April 2020.
-
Online MinCut: Competitive and Regret Analysis
Authors:
Avah Banerjee,
Guoli Ding
Abstract:
In this paper we study the mincut problem in the online setting. We consider two distinct models: A) competitive analysis and B) regret analysis. In the competitive setting we consider the vertex arrival model; whenever a new vertex arrives its neighborhood with respect to the set of known vertices is revealed. An online algorithm must make an irrevocable decision to determine the side of the cut that the vertex must belong to in order to minimize the size of the final cut. Various models are considered. 1) For classical and advice models we give tight bounds on the competitive ratio of deterministic algorithms. 2) Next we consider a few semi-adversarial inputs: random order of arrival with adversarially generated and sparse graphs. 3) Lastly we derive some structural properties of mincut-type problems with respect to greedy strategies.
Finally we consider a non-stationary regret setting with a variational budget $V_T$ and give tight bounds on the regret function. Specifically, we show that if $V_T$ is sublinear in $T$ (number of rounds) then there is a deterministic algorithm achieving a sublinear regret bound ($O(V_T)$). Further, this is optimal, even if randomization is allowed.
Submitted 14 August, 2020; v1 submitted 25 April, 2020;
originally announced April 2020.
-
Testing Moduli and Flavon Dynamics with Neutrino Oscillations
Authors:
Gui-Jun Ding,
Ferruccio Feruglio
Abstract:
We study scalar Non-Standard Neutrino Interactions (NSI) induced by moduli or flavon exchange between electrons and neutrinos. In a region with non-vanishing electron number density, they are known to determine a shift of the neutrino mass matrix. We review and extend the relevant formalism, and we update the existing limits on electron and neutrino scalar couplings. We explore the observability of scalar NSI in models of lepton masses based on flavour symmetries. We analyze models where the scalar couplings are constrained either by abelian symmetries or by modular invariance. We highlight regions of the parameter space where observable effects can occur.
Submitted 30 March, 2020;
originally announced March 2020.
-
0.71-Å resolution electron tomography enabled by deep learning aided information recovery
Authors:
Chunyang Wang,
Guanglei Ding,
Yitong Liu,
Huolin L. Xin
Abstract:
Electron tomography, as an important 3D imaging method, offers a powerful way to probe the 3D structure of materials from the nano- to the atomic-scale. However, as a grand challenge, radiation intolerance of the nanoscale samples and the missing-wedge-induced information loss and artifacts have greatly hindered us from obtaining 3D atomic structures with high fidelity. Here, for the first time, by combining generative adversarial models with state-of-the-art network architectures, we demonstrate that the resolution of electron tomography can be improved to 0.71 angstrom, which is the highest three-dimensional imaging resolution that has been reported thus far. We also show it is possible to recover the lost information and remove artifacts in the reconstructed tomograms by only acquiring data from -50 to +50 degrees (44% reduction of dosage compared to -90 to +90 degrees full tilt series). In contrast to conventional methods, the deep learning model shows outstanding performance for both macroscopic objects and atomic features, solving the long-standing dosage and missing-wedge problems in electron tomography. Our work provides important guidance for the application of machine learning methods to tomographic imaging and sheds light on its applications in other 3D imaging techniques.
Submitted 27 March, 2020;
originally announced March 2020.
-
Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation
Authors:
Guohui Ding,
Joewie J. Koh,
Kelly Merckaert,
Bram Vanderborght,
Marco M. Nicotra,
Christoffer Heckman,
Alessandro Roncone,
Lijun Chen
Abstract:
We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL). We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL), where each agent applies Q-learning with individual reward functions; and game-theoretic RL (GT-RL), where the agents update their Q-values based on the Nash equilibrium of a bimatrix Q-value game. We validate the proposed approaches in the setting of cooperative object manipulation with two simulated robot arms. Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
Submitted 20 March, 2020;
originally announced March 2020.
-
PANDA: A Gigapixel-level Human-centric Video Dataset
Authors:
Xueyang Wang,
Xiya Zhang,
Yinheng Zhu,
Yuchen Guo,
Xiaoyun Yuan,
Liuyu Xiang,
Zerun Wang,
Guiguang Ding,
David J Brady,
Qionghai Dai,
Lu Fang
Abstract:
We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both wide field-of-view (~1 square kilometer area) and high-resolution details (~gigapixel-level/frame). The scenes may contain 4k head counts with over 100x scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged by both accuracy and efficiency. Given the uniqueness of PANDA with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, where global trajectories and local interactions are simultaneously encoded, yielding promising results. We believe PANDA will contribute to the community of artificial intelligence and praxeology by understanding human behaviors and interactions in large-scale real-world scenes. PANDA Website: http://www.panda-dataset.com.
Submitted 10 March, 2020;
originally announced March 2020.
-
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
Authors:
Hui Chen,
Guiguang Ding,
Xudong Liu,
Zijia Lin,
Ji Liu,
Jungong Han
Abstract:
Enabling bi-directional retrieval of images and texts is important for understanding the correspondence between vision and language. Existing methods leverage the attention mechanism to explore such correspondence in a fine-grained manner. However, most of them consider all semantics equally and thus align them uniformly, regardless of their diverse complexities. In fact, semantics are diverse (i.e. involving different kinds of semantic concepts), and humans usually follow a latent structure to combine them into understandable language. It may be difficult for existing methods to capture such sophisticated correspondences optimally. In this paper, to address this deficiency, we propose an Iterative Matching with Recurrent Attention Memory (IMRAM) method, in which correspondences between images and texts are captured with multiple steps of alignment. Specifically, we introduce an iterative matching scheme to explore such fine-grained correspondence progressively. A memory distillation unit is used to refine alignment knowledge from early steps to later ones. Experimental results on three benchmark datasets, i.e. Flickr8K, Flickr30K, and MS COCO, show that our IMRAM achieves state-of-the-art performance, demonstrating its effectiveness. Experiments on a practical business advertisement dataset, named \Ads{}, further validate the applicability of our method in practical scenarios.
Submitted 8 March, 2020;
originally announced March 2020.
-
Predictions from warped flavordynamics based on the $T'$ family group
Authors:
Peng Chen,
Gui-Jun Ding,
Jun-Nan Lu,
José W. F. Valle
Abstract:
We propose a realistic theory of fermion masses and mixings using a five-dimensional warped scenario where all fermions propagate in the bulk and the Higgs field is localized on the IR brane. The assumed $T'$ flavor symmetry is broken on the branes by flavon fields, providing a consistent scenario where fermion mass hierarchies arise from adequate choices of the bulk mass parameters, while quark and lepton mixing angles are restricted by the family symmetry. Neutrino mass splittings, mixing parameters and the Dirac CP phase all arise from the type-I seesaw mechanism and are tightly correlated, leading to predictions for the neutrino oscillation parameters, as well as expected $0\nu\beta\beta$ decay rates within reach of upcoming experiments. The scheme also provides a good global description of flavor observables in the quark sector.
Submitted 7 December, 2020; v1 submitted 5 March, 2020;
originally announced March 2020.
-
Independent wavefront tailoring in full polarization channels by helicity-decoupled metasurface
Authors:
Guowen Ding,
Ke Chen,
Na Zhang,
Junming Zhao,
Tian Jiang,
Yijun Feng
Abstract:
Controlling the polarization and wavefront of light is essential for compact photonic systems in modern science and technology. This may be achieved by metasurfaces, a new platform that has radically changed the way people engineer wave-matter interactions. However, it remains very challenging to generate versatile beams with arbitrary and independent wavefronts in each polarization channel using a single ultrathin metasurface. By modulating both the geometric and propagation phases of the metasurface, we propose a method that can generate an assembly of circularly and linearly polarized beams while independently encoding a desired wavefront in each individual polarization channel, which we believe will greatly enhance the information capacity of meta-devices. Two proof-of-concept designs are experimentally demonstrated in the microwave region. Upon excitation by an arbitrary linear polarization, the first device can generate distinct vortex beams in two linear and two circular orthogonal polarizations, whereas the second one can generate multiple foci containing components of all polarizations. This approach to generating versatile polarizations with tailored wavefronts may pave the way to advanced, flat and multifunctional meta-devices for integrated systems.
Submitted 12 January, 2020;
originally announced January 2020.
-
Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification
Authors:
Liuyu Xiang,
Guiguang Ding,
Jungong Han
Abstract:
In real-world scenarios, data tends to exhibit a long-tailed distribution, which increases the difficulty of training deep networks. In this paper, we propose a novel self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME). Our method is inspired by the observation that networks trained on less imbalanced subsets of the distribution often yield better performances than their jointly-trained counterparts. We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model. Specifically, the proposed framework involves two levels of adaptive learning schedules: Self-paced Expert Selection and Curriculum Instance Selection, so that the knowledge is adaptively transferred to the 'Student'. We conduct extensive experiments and demonstrate that our method is able to achieve superior performances compared to state-of-the-art methods. We also show that our method can be easily plugged into state-of-the-art long-tailed classification algorithms for further improvements.
Submitted 20 September, 2020; v1 submitted 6 January, 2020;
originally announced January 2020.
-
Modular symmetry origin of texture zeros and quark lepton unification
Authors:
Jun-Nan Lu,
Xiang-Gan Liu,
Gui-Jun Ding
Abstract:
The even weight modular forms of level $N$ can be arranged into the common irreducible representations of the inhomogeneous finite modular group $Γ_N$ and the homogeneous finite modular group $Γ'_N$ which is the double covering of $Γ_N$, and the odd weight modular forms of level $N$ transform in the new representations of $Γ'_N$. We find that the above structure of modular forms can naturally generate texture zeros of the fermion mass matrices if we properly assign the representations and weights of the matter fields under the modular group. We perform a comprehensive analysis for the $Γ'_3\cong T'$ modular symmetry. Assuming that the three generations of left-handed quarks transform as a doublet and a singlet of $T'$, we find six possible texture-zero structures of the quark mass matrix up to row and column permutations. We present five benchmark quark models which can produce a very good fit to the experimental data. These quark models are further extended to include the lepton sector; the resulting models can give a unified description of both quark and lepton masses and flavor mixing simultaneously, although they contain fewer free parameters than observables.
Submitted 29 September, 2021; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Cascaded Deep Neural Networks for Retinal Layer Segmentation of Optical Coherence Tomography with Fluid Presence
Authors:
Donghuan Lu,
Morgan Heisler,
Da Ma,
Setareh Dabiri,
Sieun Lee,
Gavin Weiguang Ding,
Marinko V. Sarunic,
Mirza Faisal Beg
Abstract:
Optical coherence tomography (OCT) is a non-invasive imaging technology which can provide micrometer-resolution cross-sectional images of the inner structures of the eye. It is widely used for the diagnosis of ophthalmic diseases with retinal alteration, such as layer deformation and fluid accumulation. In this paper, a novel framework is proposed to segment retinal layers with fluid presence. The main contribution of this study is twofold: 1) we developed a cascaded network framework to incorporate the prior structural knowledge; 2) we proposed a novel deep neural network based on U-Net and fully convolutional network, termed LF-UNet. Cross-validation experiments showed that the proposed LF-UNet has superior performance compared with the state-of-the-art methods, and that incorporating the relative distance map as structural prior information could further improve the performance regardless of the network.
Submitted 6 December, 2019;
originally announced December 2019.
-
Modular $S_4$ and $A_4$ Symmetries and Their Fixed Points: New Predictive Examples of Lepton Mixing
Authors:
Gui-Jun Ding,
Stephen F. King,
Xiang-Gan Liu,
Jun-Nan Lu
Abstract:
In the modular symmetry approach to neutrino models, the flavour symmetry emerges as a finite subgroup $Γ_N$ of the modular symmetry, broken by the vacuum expectation value (VEV) of a modulus field $τ$. If the VEV of the modulus $τ$ takes some special value, a residual subgroup of $Γ_N$ would be preserved. We derive the fixed points $τ_S=i$, $τ_{ST}=(-1+i\sqrt{3})/2$, $τ_{TS}=(1+i\sqrt{3})/2$, $τ_T=i\infty$ in the fundamental domain which are invariant under the modular transformations indicated. We then generalise these fixed points to $τ_f=γτ_S$, $γτ_{ST}$, $γτ_{TS}$ and $γτ_{T}$ in the upper half complex plane, and show that it is sufficient to consider $γ\inΓ_{N}$. Focussing on level $N=4$, corresponding to the flavour group $S_4$, we consider all the resulting triplet modular forms at these fixed points up to weight 6. We then apply the results to lepton mixing, with different residual subgroups in the charged lepton sector and each of the right-handed neutrino sectors. In the minimal case of two right-handed neutrinos, we find three phenomenologically viable cases in which the light neutrino mass matrix only depends on three free parameters, and the lepton mixing takes the trimaximal TM1 pattern for two examples. One of these cases corresponds to a new Littlest Modular Seesaw based on CSD$(n)$ with $n=1+\sqrt{6}\approx 3.45$, intermediate between CSD$(3)$ and CSD$(4)$. Finally, we generalise the results to examples with three right-handed neutrinos, also considering the level $N=3$ case, corresponding to $A_4$ flavour symmetry.
Submitted 13 October, 2019; v1 submitted 8 October, 2019;
originally announced October 2019.
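The finite fixed points quoted in the abstract above can be checked numerically. The sketch below assumes the standard generator actions $S: τ \to -1/τ$ and $T: τ \to τ+1$, and reads $ST$ and $TS$ as matrix products, so that $ST$ acts as $τ \to -1/(τ+1)$ and $TS$ as $τ \to -1/τ + 1$. This reading is an assumption, not something stated in the abstract, and all function and variable names are illustrative. The point $τ_T = i\infty$ cannot be represented as a finite number and is left out.

```python
import math

def S(tau):
    """Modular generator S acting as tau -> -1/tau."""
    return -1 / tau

def T(tau):
    """Modular generator T acting as tau -> tau + 1."""
    return tau + 1

def ST(tau):
    """Matrix product S*T (assumed convention): tau -> -1/(tau + 1)."""
    return S(T(tau))

def TS(tau):
    """Matrix product T*S (assumed convention): tau -> -1/tau + 1."""
    return T(S(tau))

# Finite fixed points as quoted in the abstract.
tau_S = 1j
tau_ST = (-1 + 1j * math.sqrt(3)) / 2
tau_TS = (1 + 1j * math.sqrt(3)) / 2

def is_fixed(transform, tau, tol=1e-12):
    """True if the transformation maps tau to itself within tolerance."""
    return abs(transform(tau) - tau) < tol

# CSD(n) parameter quoted in the abstract.
n_csd = 1 + math.sqrt(6)

if __name__ == "__main__":
    print("S fixes tau_S:  ", is_fixed(S, tau_S))
    print("ST fixes tau_ST:", is_fixed(ST, tau_ST))
    print("TS fixes tau_TS:", is_fixed(TS, tau_TS))
    print("n = 1 + sqrt(6) =", round(n_csd, 2))
```

Under these conventions each point is left invariant by the transformation it is labelled with, and the value of $n$ lies between 3 and 4 as stated.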
-
Towards Scalable Koopman Operator Learning: Convergence Rates and A Distributed Learning Algorithm
Authors:
Zhiyuan Liu,
Guohui Ding,
Lijun Chen,
Enoch Yeung
Abstract:
We propose an alternating optimization algorithm for the nonconvex Koopman operator learning problem for nonlinear dynamic systems. We show that the proposed algorithm will converge to a critical point with rate $O(1/T)$ and $O(\frac{1}{\log T})$ for the constant and diminishing learning rates, respectively, under some mild conditions. To cope with high-dimensional nonlinear dynamical systems, we present the first-ever distributed Koopman operator learning algorithm. We show that the distributed Koopman operator learning has the same convergence properties as the centralized Koopman operator learning, in the absence of optimal tracker, so long as the basis functions satisfy a set of state-based decomposition conditions. Numerical experiments are provided to complement our theoretical results.
Submitted 20 March, 2020; v1 submitted 30 September, 2019;
originally announced September 2019.
-
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Authors:
Xiaohan Ding,
Guiguang Ding,
Xiangxin Zhou,
Yuchen Guo,
Jungong Han,
Ji Liu
Abstract:
Deep Neural Networks (DNNs) are powerful but computationally expensive and memory intensive, which impedes their practical usage on resource-constrained front-end devices. DNN pruning is an approach for deep model compression, which aims at eliminating some parameters with tolerable performance degradation. In this paper, we propose a novel momentum-SGD-based optimization method to reduce the network complexity by on-the-fly pruning. Concretely, given a global compression ratio, we categorize all the parameters into two parts at each training iteration, which are updated using different rules. In this way, we gradually zero out the redundant parameters, as we update them using only the ordinary weight decay but no gradients derived from the objective function. As a departure from prior methods that require heavy human effort to tune the layer-wise sparsity ratios, prune by solving complicated non-differentiable problems or finetune the model after pruning, our method is characterized by 1) global compression that automatically finds the appropriate per-layer sparsity ratios; 2) end-to-end training; 3) no need for a time-consuming re-training process after pruning; and 4) superior capability to find better winning tickets which have won the initialization lottery.
Submitted 25 October, 2019; v1 submitted 27 September, 2019;
originally announced September 2019.