-
On the Vertical Shear Instability in Magnetized Protoplanetary Disks
Authors:
Can Cui,
Min-Kai Lin
Abstract:
The vertical shear instability (VSI) is a robust phenomenon in irradiated protoplanetary disks (PPDs). While there is extensive literature on the VSI in the hydrodynamic limit, PPDs are expected to be magnetized and their extremely low ionization fractions imply that non-ideal magneto-hydrodynamic (MHD) effects should be properly considered. To this end, we present linear analyses of the VSI in magnetized disks with Ohmic resistivity. We primarily consider toroidal magnetic fields, which are likely to dominate the field geometry in PPDs. We perform vertically global and radially local analyses to capture characteristic VSI modes with extended vertical structures. To focus on the effect of magnetism, we use a locally isothermal equation of state. We find that magnetism provides a stabilizing effect to dampen the VSI, with surface modes, rather than body modes, being the first to vanish with increasing magnetization. Subdued VSI modes can be revived by Ohmic resistivity, where sufficient magnetic diffusion overcomes magnetic stabilization, and hydrodynamic results are recovered. We also briefly consider poloidal fields to account for the magnetorotational instability (MRI), which may develop towards surface layers in the outer parts of PPDs. The MRI grows efficiently at small radial wavenumbers, in contrast to the VSI. When resistivity is considered, we find the VSI dominates over the MRI for Ohmic Elsässer numbers $\lesssim 0.09$ at plasma beta parameter $β_Z \sim 10^4$.
Submitted 24 May, 2021;
originally announced May 2021.
-
Salient Feature Extractor for Adversarial Defense on Deep Neural Networks
Authors:
Jinyin Chen,
Ruoxi Chen,
Haibin Zheng,
Zhaoyan Ming,
Wenrong Jiang,
Chen Cui
Abstract:
Recent years have witnessed unprecedented success achieved by deep learning models in the field of computer vision. However, their vulnerability to carefully crafted adversarial examples has also attracted increasing attention from researchers. Motivated by the observation that adversarial examples are due to the non-robust features learned from the original dataset by models, we propose the concepts of salient feature (SF) and trivial feature (TF). The former represents the class-related feature, while the latter is usually adopted to mislead the model. We extract these two features with a coupled generative adversarial network model and put forward a novel detection and defense method named salient feature extractor (SFE) to defend against adversarial attacks. Concretely, detection is realized by separating and comparing the difference between the SF and TF of the input. At the same time, correct labels are obtained by re-identifying the SF to achieve the purpose of defense. Extensive experiments are carried out on MNIST, CIFAR-10, and ImageNet datasets where SFE shows state-of-the-art results in effectiveness and efficiency compared with baselines. Furthermore, we provide an interpretable understanding of the defense and detection process.
Submitted 14 May, 2021;
originally announced May 2021.
-
VPP-ART: An Efficient Implementation of Fixed-Size-Candidate-Set Adaptive Random Testing using Vantage Point Partitioning
Authors:
Rubing Huang,
Chenhui Cui,
Dave Towey,
Weifeng Sun,
Junlong Lian
Abstract:
Adaptive Random Testing (ART) is an enhancement of Random Testing (RT), and aims to improve the RT failure-detection effectiveness by distributing test cases more evenly in the input domain. Many ART algorithms have been proposed, with Fixed-Size-Candidate-Set ART (FSCS-ART) being one of the most effective and popular. FSCS-ART ensures high failure-detection effectiveness by selecting the next test case as the candidate farthest from previously-executed test cases. Although FSCS-ART has good failure-detection effectiveness, it also faces some challenges, including heavy computational overheads. In this paper, we propose an enhanced version of FSCS-ART, Vantage Point Partitioning ART (VPP-ART). VPP-ART addresses the FSCS-ART computational overhead problem using vantage point partitioning, while maintaining the failure-detection effectiveness. VPP-ART partitions the input domain space using a modified Vantage Point tree (VP-tree) and finds the approximate nearest executed test cases of a candidate test case in the partitioned sub-domains -- thereby significantly reducing the time overheads compared with the searches required for FSCS-ART. To enable the FSCS-ART dynamic insertion process, we modify the traditional VP-tree to support dynamic data. The simulation results show that VPP-ART has a much lower time overhead compared to FSCS-ART, but also delivers similar (or better) failure-detection effectiveness, especially in the higher dimensional input domains. According to statistical analyses, VPP-ART can improve on the FSCS-ART failure-detection effectiveness by approximately 50% to 58%. VPP-ART also compares favorably with the KDFC-ART algorithms (a series of enhanced ART algorithms based on the KD-tree). Our experiments also show that VPP-ART is more cost-effective than FSCS-ART and KDFC-ART.
Submitted 6 December, 2021; v1 submitted 12 May, 2021;
originally announced May 2021.
-
Perovskite-type YRh$_{3}$B with multiple types of nodal point and nodal line states
Authors:
Feng Zhou,
Chaoxi Cui,
Jianhua Wang,
Minquan Kuang,
Tie Yang,
Zhi-Ming Yu,
Xiaotian Wang,
Gang Zhang,
Zhenxiang Cheng
Abstract:
Experimentally synthesized perovskite-type YRh$_{3}$B with a $Pm\bar{3}m$ type structure was proposed as a novel topological material (TM) via first-principles calculations and the low-energy $k\cdot p$ effective Hamiltonian, which has a quadratic contact triple point (QCTP) at point $Γ$ and six pairs of open nodal lines (NLs) of the hybrid type. Clear surface states observed in the surface spectrum confirmed the topological states. When spin-orbit coupling was considered, the QCTP at $Γ$ transferred to the quadratic-type Dirac nodal point (NP). Under 1$\%$ tetragonal strained lattice constants, YRh$_{3}$B hosted richer topological states, including a quadratic-type two-fold degenerate NP, six pairs of open NLs of the hybrid type, and two closed NLs of type I and hybrid type. Moreover, it was proved that the NLs of YRh$_{3}$B at its strained lattice constants contain all types of band-crossing points (BCPs) (i.e., type I, type II, and critical type). Such rich types of NP and NL states in one compound make it potentially applicable for multifunctional electronic devices as well as an appropriate platform to study entanglement among topological states.
Submitted 12 May, 2021;
originally announced May 2021.
-
Directional FDR Control for Sub-Gaussian Sparse GLMs
Authors:
Chang Cui,
Jinzhu Jia,
Yijun Xiao,
Huiming Zhang
Abstract:
High-dimensional sparse generalized linear models (GLMs) have emerged in settings where the number of samples and the dimension of variables are both large, and the dimension of variables may even grow faster than the number of samples. False discovery rate (FDR) control aims to identify a small number of statistically significant nonzero results after obtaining the sparse penalized estimation of GLMs. Using the CLIME method for precision matrix estimation, we construct the debiased-Lasso estimator and prove its asymptotic normality by minimax-rate oracle inequalities for sparse GLMs. In practice, it is often necessary to accurately judge the sign of each regression coefficient, which determines whether the predictor variable is positively or negatively related to the response variable conditionally on the remaining variables. Using the debiased estimator, we establish multiple testing procedures. Under mild conditions, we show that the proposed debiased statistics can asymptotically control the directional (sign) FDR and directional false discovery variables at a pre-specified significance level. Moreover, it can be shown that our multiple testing procedure can approximately achieve a statistical power of 1. We also extend our methods to two-sample problems and propose two-sample test statistics. Under suitable conditions, we can asymptotically achieve directional FDR control and directional FDV control at the specified significance level for two-sample problems. Numerical simulations verify the FDR control of our proposed testing procedures, which sometimes outperform the classical knockoff method.
Submitted 2 May, 2021;
originally announced May 2021.
-
Spirit Distillation: A Model Compression Method with Multi-domain Knowledge Transfer
Authors:
Zhiyuan Wu,
Yu Jiang,
Minghao Zhao,
Chupeng Cui,
Zongmin Yang,
Xinhui Xue,
Hong Qi
Abstract:
Recent applications pose requirements of both cross-domain knowledge transfer and model compression to machine learning models due to insufficient training data and limited computational resources. In this paper, we propose a new knowledge distillation model, named Spirit Distillation (SD), which is a model compression method with multi-domain knowledge transfer. The compact student network mimics out a representation equivalent to the front part of the teacher network, through which the general knowledge can be transferred from the source domain (teacher) to the target domain (student). To further improve the robustness of the student, we extend SD to Enhanced Spirit Distillation (ESD) in exploiting a more comprehensive knowledge by introducing the proximity domain which is similar to the target domain for feature extraction. Results demonstrate that our method can boost mIOU and high-precision accuracy by 1.4% and 8.2% respectively with 78.2% segmentation variance, and can gain a precise compact network with only 41.8% FLOPs.
Submitted 29 April, 2021;
originally announced April 2021.
-
AlphaEvolve: A Learning Framework to Discover Novel Alphas in Quantitative Investment
Authors:
Can Cui,
Wei Wang,
Meihui Zhang,
Gang Chen,
Zhaojing Luo,
Beng Chin Ooi
Abstract:
Alphas are stock prediction models capturing trading signals in a stock market. A set of effective alphas can generate weakly correlated high returns to diversify the risk. Existing alphas can be categorized into two classes: Formulaic alphas are simple algebraic expressions of scalar features, and thus can generalize well and be mined into a weakly correlated set. Machine learning alphas are data-driven models over vector and matrix features. They are more predictive than formulaic alphas, but are too complex to mine into a weakly correlated set. In this paper, we introduce a new class of alphas to model scalar, vector, and matrix features which possess the strengths of these two existing classes. The new alphas predict returns with high accuracy and can be mined into a weakly correlated set. In addition, we propose a novel alpha mining framework based on AutoML, called AlphaEvolve, to generate the new alphas. To this end, we first propose operators for generating the new alphas and selectively injecting relational domain knowledge to model the relations between stocks. We then accelerate the alpha mining by proposing a pruning technique for redundant alphas. Experiments show that AlphaEvolve can evolve initial alphas into the new alphas with high returns and weak correlations.
Submitted 1 April, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
-
Spirit Distillation: Precise Real-time Semantic Segmentation of Road Scenes with Insufficient Data
Authors:
Zhiyuan Wu,
Yu Jiang,
Chupeng Cui,
Zongmin Yang,
Xinhui Xue,
Hong Qi
Abstract:
Semantic segmentation of road scenes is one of the key technologies for realizing autonomous driving scene perception, and the effectiveness of deep Convolutional Neural Networks (CNNs) for this task has been demonstrated. State-of-the-art CNNs for semantic segmentation suffer from excessive computation as well as large-scale training data requirements. Inspired by the ideas of Fine-tuning-based Transfer Learning (FTT) and feature-based knowledge distillation, we propose a new knowledge distillation method for cross-domain knowledge transfer and efficient data-insufficient network training, named Spirit Distillation (SD), which allows the student network to mimic the teacher network in extracting general features, so that a compact and accurate student network can be trained for real-time semantic segmentation of road scenes. Then, in order to further alleviate the problem of insufficient data and improve the robustness of the student, an Enhanced Spirit Distillation (ESD) method is proposed, which aims to exploit a more comprehensive general feature extraction capability by considering images from both the target and the proximity domains as input. To our knowledge, this paper is a pioneering work on the application of knowledge distillation to few-shot learning. Experiments conducted on Cityscapes semantic segmentation with the prior knowledge transferred from COCO2017 and KITTI demonstrate that our methods can train a better student network (mIOU and high-precision accuracy boost by 1.4% and 8.2% respectively, with 78.2% segmentation variance) with only 41.8% FLOPs (see Fig. 1).
Submitted 16 April, 2021; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Conceptual Text Region Network: Cognition-Inspired Accurate Scene Text Detection
Authors:
Chenwei Cui,
Liangfu Lu,
Zhiyuan Tan,
Amir Hussain
Abstract:
Segmentation-based methods are widely used for scene text detection due to their superiority in describing arbitrary-shaped text instances. However, two major problems still exist: 1) current label generation techniques are mostly empirical and lack theoretical support, discouraging elaborate label design; 2) as a result, most methods rely heavily on text kernel segmentation which is unstable and requires deliberate tuning. To address these challenges, we propose a human cognition-inspired framework, termed, Conceptual Text Region Network (CTRNet). The framework utilizes Conceptual Text Regions (CTRs), which is a class of cognition-based tools inheriting good mathematical properties, allowing for sophisticated label design. Another component of CTRNet is an inference pipeline that, with the help of CTRs, completely omits the need for text kernel segmentation. Compared with previous segmentation-based methods, our approach is not only more interpretable but also more accurate. Experimental results show that CTRNet achieves state-of-the-art performance on benchmark CTW1500, Total-Text, MSRA-TD500, and ICDAR 2015 datasets, yielding performance gains of up to 2.0%. Notably, to the best of our knowledge, CTRNet is among the first detection models to achieve F-measures higher than 85.0% on all four of the benchmarks, with remarkable consistency and stability.
Submitted 16 March, 2021;
originally announced March 2021.
-
Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones
Authors:
Cheng Cui,
Ruoyu Guo,
Yuning Du,
Dongliang He,
Fu Li,
Zewu Wu,
Qiwen Liu,
Shilei Wen,
Jizhou Huang,
Xiaoguang Hu,
Dianhai Yu,
Errui Ding,
Yanjun Ma
Abstract:
Recently, research efforts have been concentrated on revealing how a pre-trained model makes a difference in neural network performance. Self-supervision and semi-supervised learning technologies have been extensively explored by the community and are proven to be of great potential in obtaining a powerful pre-trained model. However, these models require huge training costs (i.e., hundreds of millions of images or training iterations). In this paper, we propose to improve existing baseline networks via knowledge distillation from off-the-shelf pre-trained big powerful models. Different from existing knowledge distillation frameworks, which require the student model to be consistent with both the soft label generated by the teacher model and the hard label annotated by humans, our solution performs distillation solely by driving the prediction of the student model to be consistent with that of the teacher model. Therefore, our distillation setting does not need manually labeled data and can be trained with extra unlabeled data to fully exploit the capability of the teacher model for better learning. We empirically find that such simple distillation settings are extremely effective; for example, the top-1 accuracy on the ImageNet-1k validation set of MobileNetV3-large and ResNet50-D can be significantly improved from 75.2% to 79% and 79.1% to 83%, respectively. We have also thoroughly analyzed which dominant factors affect the distillation performance and how they make a difference. Extensive downstream computer vision tasks, including transfer learning, object detection and semantic segmentation, can significantly benefit from the distilled pretrained models. All our experiments are implemented based on PaddlePaddle; code and a series of improved pretrained models with the ssld suffix are available in PaddleClas.
Submitted 10 March, 2021;
originally announced March 2021.
-
Skyrmion Logic Clocked via Voltage Controlled Magnetic Anisotropy
Authors:
Benjamin W. Walker,
Can Cui,
Felipe Garcia-Sanchez,
Jean Anne C. Incorvia,
Xuan Hu,
Joseph S. Friedman
Abstract:
Magnetic skyrmions are exciting candidates for energy-efficient computing due to their non-volatility, detectability, and mobility. A recent proposal within the paradigm of reversible computing enables large-scale circuits composed of directly-cascaded skyrmion logic gates, but it is limited by the manufacturing difficulty and energy costs associated with the use of notches for skyrmion synchronization. To overcome these challenges, we therefore propose a skyrmion logic synchronized via modulation of voltage-controlled magnetic anisotropy (VCMA). In addition to demonstrating the principle of VCMA synchronization through micromagnetic simulations, we also quantify the impacts of current density, skyrmion velocity, and anisotropy barrier height on skyrmion motion. Further micromagnetic results demonstrate the feasibility of cascaded logic circuits in which VCMA synchronizers enable clocking and pipelining, illustrating a feasible pathway toward energy-efficient large-scale computing systems based on magnetic skyrmions.
Submitted 5 March, 2021; v1 submitted 3 March, 2021;
originally announced March 2021.
-
GraphAttacker: A General Multi-Task GraphAttack Framework
Authors:
Jinyin Chen,
Dunjie Zhang,
Zhaoyan Ming,
Kejie Huang,
Wenrong Jiang,
Chen Cui
Abstract:
Graph neural networks (GNNs) have been successfully exploited in graph analysis tasks in many real-world applications. The competition between attack and defense methods also enhances the robustness of GNNs. In this competition, the development of adversarial training methods puts forward higher requirements for the diversity of attack examples. By contrast, most attack methods with specific attack strategies struggle to satisfy such a requirement. To address this problem, we propose GraphAttacker, a novel generic graph attack framework that can flexibly adjust the structures and the attack strategies according to the graph analysis tasks. GraphAttacker generates adversarial examples through alternate training on three key components: the multi-strategy attack generator (MAG), the similarity discriminator (SD), and the attack discriminator (AD), based on the generative adversarial network (GAN). Furthermore, we introduce a novel similarity modification rate (SMR) to conduct a stealthier attack considering the change of node similarity distribution. Experiments on various benchmark datasets demonstrate that GraphAttacker can achieve state-of-the-art attack performance on graph analysis tasks of node classification, graph classification, and link prediction, regardless of whether adversarial training is conducted. Moreover, we also analyze the unique characteristics of each task and their specific response in the unified attack framework. The project code is available at https://github.com/honoluluuuu/GraphAttacker.
Submitted 9 November, 2021; v1 submitted 17 January, 2021;
originally announced January 2021.
-
Asynchronous Multi-View SLAM
Authors:
Anqi Joyce Yang,
Can Cui,
Ioan Andrei Bârsan,
Raquel Urtasun,
Shenlong Wang
Abstract:
Existing multi-camera SLAM systems assume synchronized shutters for all cameras, which is often not the case in practice. In this work, we propose a generalized multi-camera SLAM formulation which accounts for asynchronous sensor observations. Our framework integrates a continuous-time motion model to relate information across asynchronous multi-frames during tracking, local mapping, and loop closing. For evaluation, we collected AMV-Bench, a challenging new SLAM dataset covering 482 km of driving recorded using our asynchronous multi-camera robotic platform. AMV-Bench is over an order of magnitude larger than previous multi-view HD outdoor SLAM datasets, and covers diverse and challenging motions and environments. Our experiments emphasize the necessity of asynchronous sensor modeling, and show that the use of multiple cameras is critical towards robust and accurate SLAM in challenging outdoor scenes. For additional information, please see the project website at: https://www.cs.toronto.edu/~ajyang/amv-slam
Submitted 14 July, 2021; v1 submitted 16 January, 2021;
originally announced January 2021.
-
Double Dirac Nodal Line Semimetal with Torus Surface State
Authors:
Xiao-Ping Li,
Botao Fu,
Da-Shuai Ma,
Chaoxi Cui,
Zhi-Ming Yu,
Yugui Yao
Abstract:
We propose a class of nodal line semimetals that host an eight-fold degenerate double Dirac nodal line (DDNL) with negligible spin-orbit coupling. We find only 5 of the 230 space groups host the DDNL. The DDNL can be considered as a combination of two Dirac nodal lines, and has a trivial Berry phase. This leads to two possible but completely different surface states, namely, a torus surface state covering the whole surface Brillouin zone and no surface state at all. Based on first-principles calculations, we predict that the hydrogen storage material LiBH is an ideal DDNL semimetal, where the line resides at Fermi level, is relatively flat in energy, and exhibits a large linear energy range. Interestingly, both the two novel surface states of DDNL can be realized in LiBH. Further, we predict that with a magnetic field parallel to DDNL, the Landau levels of DDNL are doubly degenerate due to Kramers-like degeneracy and have a doubly degenerate zero-mode.
Submitted 5 January, 2021;
originally announced January 2021.
-
$\bar{D}^{(*)}_{s}D^{(*)}$ molecular state with $J^{P}=1^{+}$
Authors:
Yong-Jiang Xu,
Yong-Lu Liu,
Chun-Yu Cui,
Ming-Qiu Huang
Abstract:
In this paper, we construct $\bar{D}^{(*)}_{s}D^{(*)}$-molecule-type interpolating currents $J_{(\pm)μ}(x)$ with $J^{P}=1^{+}$, calculate the corresponding mass and magnetic moment using the QCD sum rule method and its extension in the weak electromagnetic field, and study the processes of $Z_{(\pm)cs}$ to $η_{c}K^{*}$, $J/ψK$, $\bar{D}D^{*}_{s}$, and $\bar{D}^{*}D_{s}$ via three-point sum rules. The numerical values are $m_{Z_{(\pm)cs}}=3.99^{+0.17}_{-0.14}~\mbox{GeV}$, and $λ_{Z_{(\pm)cs}}=2.07^{+0.28}_{-0.16}\times10^{-2}~\mbox{GeV}^{5}$, $μ_{Z_{(\pm)cs}}=0.18^{+0.16}_{-0.09}~μ_{N}$ with $μ_{N}$ the nucleon magneton, $Γ_{Z_{(+)cs}}=17.47^{+12.70}_{-8.08}$, and $Γ_{Z_{(-)cs}}=13.86^{+10.37}_{-6.51}$. The masses are in agreement with the recently measured value of $Z_{cs}(3985)$ by the BESIII Collaboration, $m^{exp}_{Z_{cs}}=(3982.5^{+1.8}_{-2.6}\pm2.1)~\mbox{MeV}$. The widths are compatible with the experimental value, $Γ^{exp}_{Z_{cs}}=(12.8^{+5.3}_{-4.4}\pm3.0)~\mbox{MeV}$. The magnetic moment and the various decay modes can help us to determine the inner structure of $Z_{cs}(3985)$ when being confronted with experimental data in the future.
Submitted 20 November, 2021; v1 submitted 29 November, 2020;
originally announced November 2020.
-
A phase field formulation for dissolution-driven stress corrosion cracking
Authors:
Chuanjie Cui,
Rujin Ma,
Emilio Martínez-Pañeda
Abstract:
We present a new theoretical and numerical framework for modelling mechanically-assisted corrosion in elastic-plastic solids. Both pitting and stress corrosion cracking (SCC) can be captured, as well as the pit-to-crack transition. Localised corrosion is assumed to be dissolution-driven and a formulation grounded upon the film rupture-dissolution-repassivation mechanism is presented to incorporate the influence of film passivation. The model incorporates, for the first time, the role of mechanical straining as the electrochemical driving force, accelerating corrosion kinetics. The computational complexities associated with tracking the evolving metal-electrolyte interface are resolved by making use of a phase field paradigm, enabling an accurate approximation of complex SCC morphologies. The coupled electro-chemo-mechanical formulation is numerically implemented using the finite element method and an implicit time integration scheme; displacements, phase field order parameter and concentration are the primary variables. Five case studies of particular interest are addressed to showcase the predictive capabilities of the model, revealing an excellent agreement with analytical solutions and experimental measurements. By modelling these paradigmatic 2D and 3D boundary value problems we show that our formulation can capture: (i) the transition from activation-controlled corrosion to diffusion-controlled corrosion, (ii) the sensitivity of interface kinetics to mechanical stresses and strains, (iii) the role of film passivation in reducing corrosion rates, and (iv) the dependence of the stability of the passive film to local strain rates. The influence of these factors in driving the shape change of SCC defects, including the pit-to-crack transition, is a natural outcome of the model, laying the foundations for a mechanistic assessment of engineering materials and structures.
Submitted 24 November, 2020;
originally announced November 2020.
-
Domain Wall Leaky Integrate-and-Fire Neurons with Shape-Based Configurable Activation Functions
Authors:
Wesley H. Brigner,
Naimul Hassan,
Xuan Hu,
Christopher H. Bennett,
Felipe Garcia-Sanchez,
Can Cui,
Alvaro Velasquez,
Matthew J. Marinella,
Jean Anne C. Incorvia,
Joseph S. Friedman
Abstract:
Complementary metal oxide semiconductor (CMOS) devices display volatile characteristics, and are not well suited for analog applications such as neuromorphic computing. Spintronic devices, on the other hand, exhibit both non-volatile and analog features, which are well-suited to neuromorphic computing. Consequently, these novel devices are at the forefront of beyond-CMOS artificial intelligence applications. However, a large quantity of these artificial neuromorphic devices still require the use of CMOS, which decreases the efficiency of the system. To resolve this, we have previously proposed a number of artificial neurons and synapses that do not require CMOS for operation. Although these devices are a significant improvement over previous renditions, their ability to enable neural network learning and recognition is limited by their intrinsic activation functions. This work proposes modifications to these spintronic neurons that enable configuration of the activation functions through control of the shape of a magnetic domain wall track. Linear and sigmoidal activation functions are demonstrated in this work, which can be extended through a similar approach to enable a wide variety of activation functions.
Submitted 11 November, 2020;
originally announced November 2020.
-
Uncertainty Estimation in Medical Image Localization: Towards Robust Anterior Thalamus Targeting for Deep Brain Stimulation
Authors:
Han Liu,
Can Cui,
Dario J. Englot,
Benoit M. Dawant
Abstract:
Atlas-based methods are the standard approaches for automatic targeting of the Anterior Nucleus of the Thalamus (ANT) for Deep Brain Stimulation (DBS), but these are known to lack robustness when anatomic differences between atlases and subjects are large. To improve the localization robustness, we propose a novel two-stage deep learning (DL) framework, where the first stage identifies and crops the thalamus regions from the whole brain MRI and the second stage performs per-voxel regression on the cropped volume to localize the targets at the finest resolution scale. To address the issue of data scarcity, we train the models with the pseudo labels which are created based on the available labeled data using multi-atlas registration. To assess the performance of the proposed framework, we validate two sampling-based uncertainty estimation techniques namely Monte Carlo Dropout (MCDO) and Test-Time Augmentation (TTA) on the second-stage localization network. Moreover, we propose a novel uncertainty estimation metric called maximum activation dispersion (MAD) to estimate the image-wise uncertainty for localization tasks. Our results show that the proposed method achieved more robust localization performance than the traditional multi-atlas method and TTA could further improve the robustness. Moreover, the epistemic and hybrid uncertainty estimated by MAD could be used to detect the unreliable localizations and the magnitude of the uncertainty estimated by MAD could reflect the degree of unreliability for the rejected predictions.
Submitted 3 November, 2020;
originally announced November 2020.
-
A Novel Semi-Supervised Data-Driven Method for Chiller Fault Diagnosis with Unlabeled Data
Authors:
Bingxu Li,
Fanyong Cheng,
Xin Zhang,
Can Cui,
Wenjian Cai
Abstract:
In practical chiller systems, applying efficient fault diagnosis techniques can significantly reduce energy consumption and improve energy efficiency of buildings. The success of the existing methods for fault diagnosis of chillers relies on the condition that sufficient labeled data are available for training. However, label acquisition is laborious and costly in practice. Usually, the number of labeled data is limited and most data available are unlabeled. The existing methods cannot exploit the information contained in unlabeled data, which significantly limits the improvement of fault diagnosis performance in chiller systems. To make effective use of unlabeled data to further improve fault diagnosis performance and reduce the dependency on labeled data, we proposed a novel semi-supervised data-driven fault diagnosis method for chiller systems based on the semi-generative adversarial network, which incorporates both unlabeled and labeled data into learning process. The semi-generative adversarial network can learn the information of data distribution from unlabeled data and this information can help to significantly improve the diagnostic performance. Experimental results demonstrate the effectiveness of the proposed method. Under the scenario that there are only 80 labeled samples and 16000 unlabeled samples, the proposed method can improve the diagnostic accuracy to 84%, while the supervised baseline methods only reach the accuracy of 65% at most. Besides, the minimal required number of labeled samples can be reduced by about 60% with the proposed method when there are enough unlabeled samples.
Submitted 31 October, 2020;
originally announced November 2020.
-
Adaptive cognition implemented with a context-aware and flexible neuron for next-generation artificial intelligence
Authors:
Priyamvada Jadaun,
Can Cui,
Sam Liu,
Jean Anne C. Incorvia
Abstract:
Neuromorphic computing mimics the organizational principles of the brain in its quest to replicate the brain's intellectual abilities. An impressive ability of the brain is its adaptive intelligence, which allows the brain to regulate its functions "on the fly" to cope with myriad and ever-changing situations. In particular, the brain displays three adaptive and advanced intelligence abilities of context-awareness, cross frequency coupling and feature binding. To mimic these adaptive cognitive abilities, we design and simulate a novel, hardware-based adaptive oscillatory neuron using a lattice of magnetic skyrmions. Charge current fed to the neuron reconfigures the skyrmion lattice, thereby modulating the neuron's state, its dynamics and its transfer function "on the fly". This adaptive neuron is used to demonstrate the three cognitive abilities, of which context-awareness and cross-frequency coupling have not been previously realized in hardware neurons. Additionally, the neuron is used to construct an adaptive artificial neural network (ANN) and perform context-aware diagnosis of breast cancer. Simulations show that the adaptive ANN diagnoses cancer with higher accuracy while learning faster from smaller amounts of data and using a more compact and energy-efficient network than the state-of-the-art non-adaptive ANNs used for cancer diagnosis. The work further describes how hardware-based adaptive neurons can mitigate several critical challenges facing contemporary ANNs. Modern ANNs require large amounts of training data, energy and chip area and are highly task-specific; conversely, hardware-based ANNs built with adaptive neurons show faster learning from smaller datasets, compact architectures, energy-efficiency, fault-tolerance and can lead to the realization of general artificial intelligence.
Submitted 16 October, 2021; v1 submitted 29 October, 2020;
originally announced October 2020.
-
Domain Wall-Magnetic Tunnel Junction Spin Orbit Torque Devices and Circuits for In-Memory Computing
Authors:
Mahshid Alamdar,
Thomas Leonard,
Can Cui,
Bishweshwor P. Rimal,
Lin Xue,
Otitoaleke G. Akinola,
T. Patrick Xiao,
Joseph S. Friedman,
Christopher H. Bennett,
Matthew J. Marinella,
Jean Anne C. Incorvia
Abstract:
There are pressing problems with traditional computing, especially for accomplishing data-intensive and real-time tasks, that motivate the development of in-memory computing devices to both store information and perform computation. Magnetic tunnel junction (MTJ) memory elements can be used for computation by manipulating a domain wall (DW), a transition region between magnetic domains. However, these devices have suffered from challenges: spin transfer torque (STT) switching of a DW requires high current, and the multiple etch steps needed to create an MTJ pillar on top of a DW track have led to reduced tunnel magnetoresistance (TMR). These issues have limited experimental study of devices and circuits. Here, we study prototypes of three-terminal domain wall-magnetic tunnel junction (DW-MTJ) in-memory computing devices that can address data processing bottlenecks and resolve these challenges by using perpendicular magnetic anisotropy (PMA), spin-orbit torque (SOT) switching, and an optimized lithography process to produce average device tunnel magnetoresistance TMR = 164%, resistance-area product RA = 31 Ω-μm^2, close to the RA of the unpatterned film, and lower switching current density compared to using spin transfer torque. A two-device circuit shows bit propagation between devices. Device initialization variation in switching voltage is shown to be curtailed to 7% by controlling the DW initial position, which we show corresponds to 96% accuracy in a DW-MTJ full adder simulation. These results make strides in using MTJs and DWs for in-memory and neuromorphic computing applications.
Submitted 26 October, 2020;
originally announced October 2020.
-
Activation Map Adaptation for Effective Knowledge Distillation
Authors:
Zhiyuan Wu,
Hong Qi,
Yu Jiang,
Minghao Zhao,
Chupeng Cui,
Zongmin Yang,
Xinhui Xue
Abstract:
Model compression has become a recent trend due to the need to deploy neural networks on embedded and mobile devices. Hence, both accuracy and efficiency are of critical importance. To explore a balance between them, a knowledge distillation strategy is proposed for general visual representation learning. It utilizes our well-designed activation map adaptive module to replace some blocks of the teacher network, adaptively exploring the most appropriate supervisory features during the training process. The teacher's hidden layer output is used to guide the training of the student network so as to transfer effective semantic information. To verify the effectiveness of our strategy, we applied our method to the CIFAR-10 dataset. Results demonstrate that the method can boost the accuracy of the student network by 0.6% with a 6.5% loss reduction, and significantly improve its training speed.
Submitted 14 April, 2022; v1 submitted 26 October, 2020;
originally announced October 2020.
-
MLANE: Meta-Learning Based Adaptive Network Embedding
Authors:
Chen Cui,
Ning Yang,
Philip S. Yu
Abstract:
Most existing random walk based network embedding methods often follow only one of two principles, homophily or structural equivalence. In real world networks, however, nodes exhibit a mixture of homophily and structural equivalence, which requires adaptive network embedding that can adaptively preserve both homophily and structural equivalence for different nodes in different down-stream analysis tasks. In this paper, we propose a novel method called Meta-Learning based Adaptive Network Embedding (MLANE), which can learn adaptive sampling strategy for different nodes in different tasks by incorporating sampling strategy learning with embedding learning into one optimization problem that can be solved via an end-to-end meta-learning framework. In extensive experiments on real datasets, MLANE shows significant performance improvements over the baselines. The source code of MLANE and the datasets used in experiments and all the hyperparameter settings for baselines are available at https://github.com/7733com/MLANE.
Submitted 24 October, 2020;
originally announced October 2020.
-
HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network
Authors:
Pengcheng Yuan,
Shufei Lin,
Cheng Cui,
Yuning Du,
Ruoyu Guo,
Dongliang He,
Errui Ding,
Shumin Han
Abstract:
This paper presents a representational block named Hierarchical-Split Block, which can be used as a plug-and-play block to upgrade existing convolutional neural networks and significantly improves model performance. Hierarchical-Split Block contains many hierarchical split and concatenate connections within a single residual block. We find that multi-scale features are of great importance for numerous vision tasks. Moreover, Hierarchical-Split Block is very flexible and efficient, which provides a large space of potential network architectures for different applications. In this work, we present a common backbone based on Hierarchical-Split Block for the following tasks: image classification, object detection, instance segmentation and semantic image segmentation/parsing. Our approach shows significant improvements over all these core tasks in comparison with the baseline. As shown in Figure 1, for image classification, our 50-layer network (HS-ResNet50) achieves 81.28% top-1 accuracy with competitive latency on the ImageNet-1k dataset. It also outperforms most state-of-the-art models. The source code and models will be available at: https://github.com/PaddlePaddle/PaddleClas
Submitted 15 October, 2020;
originally announced October 2020.
-
GWOPS: A VO-technology Driven Tool to Search for the Electromagnetic Counterpart of Gravitational Wave Event
Authors:
Yunfei Xu,
Dong Xu,
Chenzhou Cui,
Dongwei Fan,
Zipei Zhu,
Bangyao Yu,
Changhua Li,
Jun Han,
Linying Mi,
Shanshan Li,
Boliang He,
Yihan Tao,
Hanxi Yang,
Sisi Yang
Abstract:
The search for and follow-up observation of electromagnetic (EM) counterparts of gravitational waves (GW) is a current hot topic of GW cosmology. Due to the limited accuracy of GW observation facilities at this stage, we can only get a rough sky-localization region for a GW event, and the typical area of the region is between 200 and 1500 square degrees. Since GW events occur in or near galaxies, limiting the observation targets to galaxies can significantly speed up the search for EM counterparts. Therefore, how to efficiently select host galaxy candidates in such a large GW localization region, how to arrange the observation sequence, and how to efficiently identify the GW source from observational data are the problems that need to be solved. The International Virtual Observatory Alliance has developed a series of technical standards for data retrieval, interoperability and visualization. Based on the application of VO technologies, we construct the GW follow-up Observation Planning System (GWOPS). It consists of three parts: a pipeline to select host candidates of GW events and sort their priorities for follow-up observation, an identification module to find the transient in follow-up observation data, and a visualization module to display GW-related data. GWOPS can rapidly respond to GW events. With GWOPS, operations such as follow-up observation planning, data storage, data visualization, and transient identification can be efficiently coordinated, which will improve the success rate of searches for GW EM counterparts.
Submitted 9 September, 2020; v1 submitted 7 September, 2020;
originally announced September 2020.
-
Signature of band inversion in the antiferromagnetic phase of axion insulator candidate EuIn2As2
Authors:
Takafumi Sato,
Zhiwei Wang,
Daichi Takane,
Seigo Souma,
Chaoxi Cui,
Yongkai Li,
Kosuke Nakayama,
Tappei Kawakami,
Yuya Kubota,
Cephise Cacho,
Timur K. Kim,
Arian Arab,
Vladimir N. Strocov,
Yugui Yao,
Takashi Takahashi
Abstract:
We have performed angle-resolved photoemission spectroscopy on EuIn2As2 which is predicted to be an axion insulator in the antiferromagnetic state. By utilizing soft-x-ray and vacuum-ultraviolet photons, we revealed a three-dimensional hole pocket centered at the Gamma point of bulk Brillouin zone together with a heavily hole-doped surface state in the paramagnetic phase. Upon entering the antiferromagnetic phase, the band structure exhibits a marked reconstruction characterized by the emergence of a "M"-shaped bulk band near the Fermi level. The qualitative agreement with first-principles band-structure calculations suggests the occurrence of bulk-band inversion at the Gamma point in the antiferromagnetic phase. We suggest that EuIn2As2 provides a good opportunity to study the exotic quantum phases associated with possible axion-insulator phase.
Submitted 2 September, 2020;
originally announced September 2020.
-
Dynamic Attitude Estimation Improvement for Low-cost MEMS IMU by Integrating Low-cost GPS
Authors:
Guiqiu Liao,
Jiankang Zhao,
Chao Cui,
Haihui Long,
Jianbin Zhu,
Achraf Djerida
Abstract:
This paper proposes a low-cost six Degree-of-Freedom (6-DOF) navigation system for small aerial robots based on the integration of a Global Positioning System (GPS) receiver with inertial Microelectromechanical Systems (MEMS) sensors. In fusing an Inertial Measurement Unit (IMU) with low-cost GPS, the effect of time synchronization error on attitude estimation is a concern. A fusion algorithm that can estimate the motion states and the time synchronization error simultaneously is proposed. This algorithm adds a time estimation loop to improve estimation accuracy. Compared with another state-augmented estimation approach, this method has the advantages of a lower computational burden and avoidance of the discretization error at low sample rates. The estimation algorithm is implemented on a low-cost embedded microprocessor, where the update rate of the algorithm can exceed 100 Hz, and therefore high-performance computational units are not necessary. In a robotic experiment, the proposed algorithm serves as the navigation solution for a small aerial robot. The accuracy and reliability of the self-designed system are tested when the robot is moving with significant acceleration.
Submitted 24 August, 2020;
originally announced August 2020.
-
Beyond Point Estimate: Inferring Ensemble Prediction Variation from Neuron Activation Strength in Recommender Systems
Authors:
Zhe Chen,
Yuyan Wang,
Dong Lin,
Derek Zhiyuan Cheng,
Lichan Hong,
Ed H. Chi,
Claire Cui
Abstract:
Despite deep neural network (DNN)'s impressive prediction performance in various domains, it is well known now that a set of DNN models trained with the same model specification and the same data can produce very different prediction results. Ensemble method is one state-of-the-art benchmark for prediction uncertainty estimation. However, ensembles are expensive to train and serve for web-scale traffic.
In this paper, we seek to advance the understanding of prediction variation estimated by the ensemble method. Through empirical experiments on two widely used recommender-system benchmark datasets, MovieLens and Criteo, we observe that prediction variations come from various sources of randomness, including training data shuffling and random parameter initialization. By introducing more randomness into model training, we notice that the ensemble's mean predictions tend to be more accurate while the prediction variations tend to be higher. Moreover, we propose to infer prediction variation from neuron activation strength and demonstrate the strong predictive power of activation strength features. Our experimental results show that the average R squared on MovieLens is as high as 0.56 and on Criteo is 0.81. Our method performs especially well when detecting the lowest and highest variation buckets, with 0.92 AUC and 0.89 AUC respectively. Our approach provides a simple way to estimate prediction variation, which opens up new opportunities for future work in many interesting areas (e.g., model-based reinforcement learning) without relying on serving expensive ensemble models.
Submitted 16 August, 2020;
originally announced August 2020.
-
An Intelligent Control Strategy for buck DC-DC Converter via Deep Reinforcement Learning
Authors:
Chenggang Cui,
Nan Yan,
Chuanlin Zhang
Abstract:
As a typical switching power supply, the DC-DC converter has been widely applied in DC microgrids. Due to the variation of renewable energy generation, the research and design of DC-DC converter control algorithms with outstanding dynamic characteristics has significant theoretical and practical value. To mitigate the bus voltage stability issue in DC microgrids, an innovative intelligent control strategy for a buck DC-DC converter with constant power loads (CPLs) via a deep reinforcement learning algorithm is constructed for the first time. In this article, a Markov Decision Process (MDP) model and the deep Q network (DQN) algorithm are defined for the DC-DC converter. A model-free deep reinforcement learning (DRL) control strategy is designed to adjust the agent-environment interaction through a reward/penalty mechanism so that the output converges to the nominal voltage. The agent makes approximate decisions by extracting high-dimensional features of complex power systems without any prior knowledge. Finally, the simulation comparison results demonstrate that the proposed controller has stronger self-learning and self-optimization capabilities under different scenarios.
Submitted 11 August, 2020;
originally announced August 2020.
-
High-Purity Pulsed Squeezing Generation with Integrated Photonics
Authors:
Chaohan Cui,
Christos N. Gagatsos,
Saikat Guha,
Linran Fan
Abstract:
Squeezed light has evolved into a powerful tool for quantum technology, with applications ranging from quantum-enhanced sensing to quantum state engineering based on partial post-selection techniques. The pulsed generation of squeezed light is of particular interest, as it can provide an accurate time stamp and a physically defined temporal mode, which are highly preferred in complex communication networks and large-scale information processing. However, the multimode feature of pulsed squeezing in the conventional single-pass configuration limits the purity of the output state, negatively impacting its application in quantum technology. In this Letter, we propose a new approach to generate pulsed squeezing with high temporal purity. Pulsed squeezing based on parametric down-conversion in photonic cavities is analyzed. We show that the effective mode number of the output squeezed light approaches unity. Such high-purity squeezed light can be realized over a broad range of parameters and with low pump power, providing a robust approach to generate large-scale quantum resources.
Submitted 14 July, 2020;
originally announced July 2020.
-
High-dimensional Frequency-Encoded Quantum Information Processing with Passive Photonics and Time-Resolving Detection
Authors:
Chaohan Cui,
Kaushik P. Seshadreesan,
Saikat Guha,
Linran Fan
Abstract:
In this Letter, we propose a new approach to process high-dimensional quantum information encoded in the photon frequency domain. In contrast to previous approaches based on nonlinear optical processes, no active control of photon energy is required. Arbitrary unitary transformations and projection measurements can be realized with passive photonic circuits and time-resolving detection. A systematic circuit design for a quantum frequency comb of arbitrary size is given. The criteria to verify quantum frequency correlation are derived. By considering the practical condition of the detector's finite response time, we show that high-fidelity operation can be readily realized with current device performance. This work will pave the way towards scalable and high-fidelity quantum information processing based on high-dimensional frequency encoding.
Submitted 14 July, 2020;
originally announced July 2020.
-
Semi-Supervised Recognition under a Noisy and Fine-grained Dataset
Authors:
Cheng Cui,
Zhi Ye,
Yangxi Li,
Xinjian Li,
Min Yang,
Kai Wei,
Bing Dai,
Yanmei Zhao,
Zhongji Liu,
Rong Pang
Abstract:
Semi-Supervised Recognition Challenge-FGVC7 is a challenging fine-grained recognition competition. One of the difficulties of this competition is how to use unlabeled data. We adopted pseudo-tag data mining to increase the amount of training data. The other is how to distinguish similar birds with very small differences, especially those with a relatively tiny main body in the examples. We combined generic image recognition and fine-grained image recognition methods to solve the problem. All generic image recognition models were trained using PaddleClas. Using the combination of two different kinds of deep recognition models, we finally won third place in the competition.
Submitted 18 June, 2020;
originally announced June 2020.
-
A Redistribution Tool for Long-Term Archive of Astronomical Observation Data
Authors:
Chao Sun,
Ce Yu,
Chenzhou Cui,
Boliang He,
Jian Xiao,
Zhen Li,
Shanjiang Tang,
Jizhou Sun
Abstract:
Astronomical observation data require long-term preservation, and the rapid accumulation of observation data makes it necessary to consider the cost of long-term archive storage. In addition to low-speed disk-based online storage, optical disk or tape-based offline storage can be used to save costs. However, for astronomical research that requires historical data (particularly time-domain astronomy), the performance and energy consumption of data-accessing techniques cause problems because the requested data (which are organized according to observation time) may be located across multiple storage devices. In this study, we design and develop a tool referred to as AstroLayout to redistribute the observation data using spatial aggregation. The core algorithm uses graph partitioning to generate an optimized data placement according to the original observation data statistics and the target storage system. For the given observation data, AstroLayout can copy the long-term archive into the target storage system in accordance with this placement. An efficiency evaluation shows that AstroLayout can reduce the number of devices activated when responding to data-access requests in time-domain astronomy research. In addition to improving the performance of data-accessing techniques, AstroLayout can also reduce the storage system's power consumption. For enhanced adaptability, it supports storage systems of any media, including optical disks, tapes, and hard disks.
Submitted 16 June, 2020;
originally announced June 2020.
-
Towards an Astronomical Science Platform: Experiences and Lessons Learned from Chinese Virtual Observatory
Authors:
Chenzhou Cui,
Yihan Tao,
Changhua Li,
Dongwei Fan,
Jian Xiao,
Boliang He,
Shanshan Li,
Ce Yu,
Linying Mi,
Yunfei Xu,
Jun Han,
Sisi Yang,
Yongheng Zhao,
Yanjie Xue,
**xin Hao,
Liang Liu,
Xiao Chen,
Junyi Chen,
Hailong Zhang
Abstract:
In the era of big data astronomy, next generation telescopes and large sky surveys produce data sets at the TB or even PB level. Due to their large data volumes, these astronomical data sets are extremely difficult to transfer and analyze using personal computers or small clusters. In order to offer better access to data, data centers now generally provide online science platforms that enable analysis close to the data. The Chinese Virtual Observatory (China-VO) is one of the member projects in the International Virtual Observatory Alliance and it is dedicated to providing a research and education environment where globally distributed astronomy archives are simple to find, access, and interoperate. In this study, we summarize highlights of the work conducted at the China-VO, as well as the experiences and lessons learned during the full life-cycle management of astronomical data. Finally, we discuss the challenges and future trends for astronomical science platforms.
Submitted 21 May, 2020;
originally announced May 2020.
-
SUPER: A Novel Lane Detection System
Authors:
**** Lu,
Chen Cui,
Shaobing Xu,
Huei Peng,
Fan Wang
Abstract:
AI-based lane detection algorithms were actively studied over the last few years. Many have demonstrated superior performance compared with traditional feature-based methods. The accuracy, however, still generally ranges from the low 80% to the high 90%, or is even lower when challenging images are used. In this paper, we propose a real-time lane detection system, called the Scene Understanding Physics-Enhanced Real-time (SUPER) algorithm. The proposed method consists of two main modules: 1) a hierarchical semantic segmentation network as the scene feature extractor and 2) a physics-enhanced multi-lane parameter optimization module for lane inference. We train the proposed system using heterogeneous data from Cityscapes, Vistas and Apollo, and evaluate the performance on four completely separate datasets (that were never seen before), including Tusimple, Caltech, URBAN KITTI-ROAD, and X-3000. The proposed approach performs as well as or better than lane detection models already trained on the same dataset and performs well even on datasets it was never trained on. Real-world vehicle tests were also conducted. Preliminary test results show promising real-time lane-detection performance compared with the Mobileye.
Submitted 14 May, 2020;
originally announced May 2020.
-
Towards Accurate and Robust Domain Adaptation under Noisy Environments
Authors:
Zhongyi Han,
Xian-** Gui,
Chaoran Cui,
Yilong Yin
Abstract:
In non-stationary environments, learning machines usually confront the domain adaptation scenario where the data distribution changes over time. Previous domain adaptation works have achieved great success in theory and practice. However, they always lose robustness in noisy environments where the labels and features of examples from the source domain become corrupted. In this paper, we report our attempt towards achieving accurate noise-robust domain adaptation. We first give a theoretical analysis that reveals how harmful noises influence unsupervised domain adaptation. To eliminate the effect of label noise, we propose an offline curriculum learning for minimizing a newly-defined empirical source risk. To reduce the impact of feature noise, we propose a proxy distribution based margin discrepancy. We seamlessly transform our methods into an adversarial network that performs efficient joint optimization for them, successfully mitigating the negative influence from both data corruption and distribution shift. A series of empirical studies show that our algorithm remarkably outperforms the state of the art, with over 10% accuracy improvements in some domain adaptation tasks under noisy environments.
Submitted 4 May, 2020; v1 submitted 26 April, 2020;
originally announced April 2020.
-
Experimental evidence of monolayer AlB$_2$ with symmetry-protected Dirac cones
Authors:
Daiyu Geng,
Kejun Yu,
Shaosheng Yue,
** Cao,
Wenbin Li,
Dashuai Ma,
Chaoxi Cui,
Masashi Arita,
Shiv Kumar,
Eike F. Schwier,
Kenya Shimada,
Peng Cheng,
Lan Chen,
Kehui Wu,
Yugui Yao,
Baojie Feng
Abstract:
Monolayer AlB$_2$ is composed of two atomic layers: honeycomb borophene and triangular aluminum. In contrast with the bulk phase, monolayer AlB$_2$ is predicted to be a superconductor with a high critical temperature. Here, we demonstrate that monolayer AlB$_2$ can be synthesized on Al(111) via molecular beam epitaxy. Our theoretical calculations revealed that the monolayer AlB$_2$ hosts several Dirac cones along the $Γ$--M and $Γ$--K directions; these Dirac cones are protected by crystal symmetries and are thus resistant to external perturbations. The extraordinary electronic structure of the monolayer AlB$_2$ was confirmed via angle-resolved photoemission spectroscopy measurements. These results are likely to stimulate further research interest to explore the exotic properties arising from the interplay of Dirac fermions and superconductivity in two-dimensional materials.
Submitted 22 April, 2020;
originally announced April 2020.
-
Plasticity-Enhanced Domain-Wall MTJ Neural Networks for Energy-Efficient Online Learning
Authors:
Christopher H. Bennett,
T. Patrick Xiao,
Can Cui,
Naimul Hassan,
Otitoaleke G. Akinola,
Jean Anne C. Incorvia,
Alvaro Velasquez,
Joseph S. Friedman,
Matthew J. Marinella
Abstract:
Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 $μJ$ even for large tasks used typically in machine learning.
Submitted 4 March, 2020;
originally announced March 2020.
-
IVOA HiPS Implementation in the Framework of WorldWide Telescope
Authors:
Yunfei Xu,
Chenzhou Cui,
Dongwei Fan,
Shanshan Li,
Changhua Li,
Jun Han,
Linying Mi,
Boliang He,
Hanxi Yang,
Yihan Tao,
Sisi Yang,
Lan He
Abstract:
The WorldWide Telescope (WWT) is a scientific visualization platform which can browse deep space images, star catalogs, and planetary remote sensing data from different observation facilities in a three-dimensional virtual scene. First launched and then open-sourced by Microsoft Research, the WWT is now managed by the American Astronomical Society (AAS). Hierarchical Progressive Survey (HiPS) is an astronomical data release scheme proposed by the Centre de Données astronomiques de Strasbourg (CDS) and has been accepted as a recommendation by the International Virtual Observatory Alliance (IVOA). The HiPS solution has been adopted widely by many astronomical institutions for data release. Since WWT selected Hierarchical Triangular Mesh (HTM) as the standard for data visualization in the early stage of development, data released by HiPS cannot be visualized in WWT, which significantly limits the application of WWT. This paper introduces the implementation method for HiPS dataset visualization in WWT, covering HiPS data projection, mesh rendering, and data index implementation. Taking Chang'E-2 lunar probe data as an example, this paper introduces how to convert planetary remote sensing data into a HiPS dataset and integrate it into WWT. This paper also compares the efficiency and memory consumption of WWT loading its native data and HiPS data, and illustrates the application of HiPS in scientific data visualization and science education in WWT.
Submitted 4 March, 2020;
originally announced March 2020.
-
Condensed Generalized Finite Element Method (CGFEM)
Authors:
Qinghui Zhang,
Cu Cui
Abstract:
Generalized or extended finite element methods (GFEM/XFEM) are in general badly conditioned and have numerous additional degrees of freedom (DOF) compared with the FEM because of the introduction of enriched functions. In this paper, we develop an approach to establish a subspace of a conventional GFEM/XFEM approximation space using partition of unity (PU) techniques and local least-squares procedures. The proposed GFEM is referred to as condensed GFEM (CGFEM), which (i) possesses as many DOFs as the preliminary FEM, (ii) enjoys approximation properties similar to those of the GFEM/XFEM, and (iii) is well-conditioned in the sense that its conditioning is of the same order as that of the FEM. The fundamental approximation properties of CGFEM are proven mathematically. The CGFEM is applied to a problem of high order polynomial approximations and a Poisson crack problem; optimal convergence orders of the former are proven rigorously. Numerical experiments and comparisons with the conventional GFEM/XFEM and FEM are made to verify the theory and effectiveness of CGFEM.
Submitted 2 February, 2020;
originally announced February 2020.
-
Maximized Lateral Inhibition in Paired Magnetic Domain Wall Racetracks for Neuromorphic Computing
Authors:
C. Cui,
O. G. Akinola,
N. Hassan,
C. H. Bennett,
M. J. Marinella,
J. S. Friedman,
J. A. C. Incorvia
Abstract:
Lateral inhibition is an important functionality in neuromorphic computing, modeled after the biological neuron behavior that a firing neuron deactivates its neighbors belonging to the same layer and prevents them from firing. In most neuromorphic hardware platforms lateral inhibition is implemented by external circuitry, thereby decreasing the energy efficiency and increasing the area overhead of such systems. Recently, the domain wall -- magnetic tunnel junction (DW-MTJ) artificial neuron was demonstrated in modeling to be inherently inhibitory. Without peripheral circuitry, lateral inhibition in DW-MTJ neurons results from magnetostatic interaction between neighboring neuron cells. However, the lateral inhibition mechanism in DW-MTJ neurons has not been studied thoroughly, leading to weak inhibition only in very closely-spaced devices. This work approaches these problems by modeling current- and field-driven DW motion in a pair of adjacent DW-MTJ neurons. We maximize the magnitude of lateral inhibition by tuning the magnetic interaction between the neurons. The results are explained by current-driven DW velocity characteristics in response to external magnetic field and quantified by an analytical model. Finally, the dependence of lateral inhibition strength on device parameters is investigated. This provides a guideline for the optimization of lateral inhibition implementation in DW-MTJ neurons. With strong lateral inhibition achieved, a path towards competitive learning algorithms such as winner-take-all is made possible on such neuromorphic devices.
Submitted 10 December, 2019;
originally announced December 2019.
-
Global Simulations of the Vertical Shear Instability with Non-ideal Magnetohydrodynamical Effects
Authors:
Can Cui,
Xue-Ning Bai
Abstract:
The mechanisms of angular momentum transport and the level of turbulence in protoplanetary disks (PPDs) are crucial for understanding many aspects of planet formation. In recent years, it has been realized that the magneto-rotational instability (MRI) tends to be suppressed in PPDs due to non-ideal MHD effects, and the disk is largely laminar with accretion driven by magnetized disk winds. In parallel, several hydrodynamical mechanisms have been identified that likely also generate vigorous turbulence and drive disk accretion. We study the interplay between MHD winds in PPDs and the vertical shear instability (VSI), one of the most promising hydrodynamical mechanisms, through 2D global non-ideal MHD simulations with ambipolar diffusion and Ohmic resistivity. We find that for typical disk parameters, MHD winds can coexist with the VSI, with accretion primarily wind-driven and accompanied by vigorous VSI turbulence. The properties of the VSI remain similar to the unmagnetized case, and the wind and overall field configuration are not strongly affected by VSI turbulence, showing a modest level of variability and corrugation of the midplane current sheet. Enhanced coupling between gas and magnetic field weakens the VSI. The VSI is also weakened with increasing magnetization, and we find that the corrugation motions characteristic of the VSI transition to low-amplitude breathing mode oscillations.
Submitted 29 January, 2020; v1 submitted 5 December, 2019;
originally announced December 2019.
-
Deep Physiological State Space Model for Clinical Forecasting
Authors:
Yuan Xue,
Denny Zhou,
Nan Du,
Andrew Dai,
Zhen Xu,
Kun Zhang,
Claire Cui
Abstract:
Clinical forecasting based on electronic medical records (EMR) can uncover the temporal correlations between patients' conditions and outcomes from sequences of longitudinal clinical measurements. In this work, we propose an intervention-augmented deep state space generative model to capture the interactions among clinical measurements and interventions by explicitly modeling the dynamics of patients' latent states. Based on this model, we are able to make a joint prediction of the trajectories of future observations and interventions. Empirical evaluations show that our proposed model compares favorably to several state-of-the-art methods on real EMR data.
Submitted 3 December, 2019;
originally announced December 2019.
-
The International Virtual Observatory Alliance in 2019
Authors:
Mark A. Allen,
Patrick Dowler,
Janet D. Evans,
Chenzhou Cui,
Tim Jenness,
Bruno Merin,
G. Bruce Berriman,
J. J. Kavelaars
Abstract:
The International Virtual Observatory Alliance (IVOA) held its bi-annual Interoperability Meetings in May 2019, and in October 2019 following the ADASS XXIX conference. We provide a brief report on the status of the IVOA and the activities of the Interoperability Meetings.
Submitted 3 December, 2019;
originally announced December 2019.
-
2nd Place Solution in Google AI Open Images Object Detection Track 2019
Authors:
Ruoyu Guo,
Cheng Cui,
Yuning Du,
Xianglong Meng,
Xiaodi Wang,
**gwei Liu,
Jianfeng Zhu,
Yuan Feng,
Shumin Han
Abstract:
We present an object detection framework based on PaddlePaddle. We combine all the strategies (multi-scale training, FPN, Cascade, Dcnv2, Non-local, libra loss) on a ResNet200-vd backbone. Our model's score on the public leaderboard reaches 0.6269 with a single-scale test. We propose a new voting method called top-k voting-nms, based on the SoftNMS detection results. The voting method helps us merge all the models' results more easily and achieve 2nd place in the Google AI Open Images Object Detection Track 2019.
Submitted 17 November, 2019;
originally announced November 2019.
-
Modelling EHR timeseries by restricting feature interaction
Authors:
Kun Zhang,
Yuan Xue,
Gerardo Flores,
Alvin Rajkomar,
Claire Cui,
Andrew M. Dai
Abstract:
Time series data are prevalent in electronic health records, mostly in the form of physiological parameters such as vital signs and lab tests. The patterns of these values may be significant indicators of patients' clinical states, and there might be patterns that are unknown to clinicians but are highly predictive of some outcomes. Many of these values are also missing, which makes it difficult to apply existing methods like decision trees. We propose a recurrent neural network model that reduces overfitting to noisy observations by limiting interactions between features. We analyze its performance on mortality, ICD-9 and AKI prediction from observational values on the Medical Information Mart for Intensive Care III (MIMIC-III) dataset. Our models result in an improvement of 1.1% [p<0.01] in AU-ROC for mortality prediction under the MetaVision subset and 1.0% and 2.2% [p<0.01] respectively for mortality and AKI under the full MIMIC-III dataset compared to existing state-of-the-art interpolation, embedding and decay-based recurrent models.
Submitted 14 November, 2019;
originally announced November 2019.
-
Large-scale dynamics of winds originated from black hole accretion flows: (I) Hydrodynamics
Authors:
Can Cui,
Feng Yuan,
Bo Li
Abstract:
Winds from black hole accretion flows are ubiquitous. Previous works mainly focus on the launching of wind at the accretion flow scale. It still remains unclear how far the winds can propagate outward and what their large-scale dynamics are. As the first paper of this series, we study the large-scale dynamics of thermal wind beyond accretion scales via analytical and numerical methods. Boundary conditions, which are crucial to our problem, are analyzed and presented based on the small-scale simulations combined with observations of winds. Both black hole and galaxy potential are taken into account. For winds originating from hot accretion flows, we find that the wind can reach large scales. The radial profiles of velocity, density, and temperature can be approximated by $v_r\approx v_{r0}, ρ\approx ρ_{0}(r/r_0)^{-2}$, and $T\approx T_0 (r/r_0)^{-2(γ-1)}$, where $v_{r0}, ρ_0, T_0$ are the velocity, density, and temperature of winds at the boundary $r_0(\equiv 10^3 r_g)$, and $γ$ is the polytropic index. During the outward propagation, the enthalpy and the rotational energy compensate the increase of gravitational potential. For thin disks, we find that because the Bernoulli parameter is smaller, winds cannot propagate as far as the hot winds, but stop at a certain radius where the Bernoulli parameter is equal to the potential energy. Before the winds stop, the profiles of dynamical quantities can also be approximated by the above relations. In this case the rotational energy alone compensates the increase of the potential energy.
Submitted 14 February, 2020; v1 submitted 30 October, 2019;
originally announced October 2019.
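A short consistency check on the scalings quoted in this abstract, assuming a steady, spherically symmetric outflow with constant mass flux and a polytropic gas. These assumptions are a simplification made here for illustration and are not the paper's full derivation:

```latex
\begin{align}
\dot{M} = 4\pi r^{2}\rho\, v_r = \text{const}, \quad v_r \approx v_{r0}
  &\;\Rightarrow\; \rho \approx \rho_0 \left(\frac{r}{r_0}\right)^{-2}, \\
T \propto \rho^{\gamma-1}
  &\;\Rightarrow\; T \approx T_0 \left(\frac{r}{r_0}\right)^{-2(\gamma-1)} .
\end{align}
```

Under these assumptions the density profile follows from mass conservation alone once the radial velocity is roughly constant, and the temperature profile follows from the polytropic relation.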
-
Large-scale dynamics of winds originated from black hole accretion flows: (II) Magnetohydrodynamics
Authors:
Can Cui,
Feng Yuan
Abstract:
Winds from black hole accretion disks are essential ingredients in understanding the coevolution between the supermassive black hole and its host galaxy. The great difference in dynamical ranges between small-scale accretion disk simulations and large-scale or cosmological simulations makes it difficult to track wind kinematics. In the first paper of this series, we studied the dynamics of disk winds from the outer edge of the accretion disk toward galaxy scales in the hydrodynamical framework. In this paper, we further incorporate magnetic fields to understand the wind dynamics by adopting a one-dimensional magnetohydrodynamical (MHD) model, with boundary conditions set for hot accretion flows. The geometry of the poloidal magnetic field is prescribed as a straight line with an angle $θ=45^\circ$ from the rotational axis, and the strength satisfies the divergence-free condition. The wind solution is obtained by requiring the gas to pass through the slow, Alfvén and fast magneto-sonic points smoothly. Physical quantities are found to show a power-law dependence on cylindrical radius $R$ beyond the fast magneto-sonic point, for which $ρ\propto R^{-2}, v_{\rm p}\propto {\rm const.}, v_{\rm φ}\propto R^{-1}, B_{\rm φ}\propto R^{-1},$ and $ β\propto ρ^{γ-1}$. The magnetization of the wind is dominant in determining the wind properties. The wind is accelerated to a greater terminal velocity with strong magnetization ($v_{\rm Ap0}>1$) compared to the hydrodynamical case, in which the magnetic pressure gradient dominates and the centrifugal potential converts to kinetic energy. The dependence of wind physical quantities on magnetization, temperature, field line angular velocity, and adiabatic index is also discussed.
Submitted 14 February, 2020; v1 submitted 30 October, 2019;
originally announced October 2019.
-
Active Subspace of Neural Networks: Structural Analysis and Universal Attacks
Authors:
Chunfeng Cui,
Kaiqi Zhang,
Talgat Daulbaev,
Julia Gusak,
Ivan Oseledets,
Zheng Zhang
Abstract:
Active subspace is a model reduction method widely used in the uncertainty quantification community. In this paper, we propose analyzing the internal structure and vulnerability of deep neural networks using active subspace. Firstly, we employ the active subspace to measure the number of "active neurons" at each intermediate layer and reduce the number of neurons from several thousands to several dozens. This motivates us to change the network structure and to develop a new and more compact network, referred to as ASNet, that has significantly fewer model parameters. Secondly, we propose analyzing the vulnerability of a neural network using active subspace and finding an additive universal adversarial attack vector that can misclassify a dataset with a high probability. Our experiments on CIFAR-10 show that ASNet can achieve 23.98$\times$ parameter and 7.30$\times$ flops reduction. The universal active subspace attack vector can achieve around 20% higher attack ratio compared with the existing approach in all of our numerical experiments. The PyTorch codes for this paper are available online.
Submitted 29 April, 2020; v1 submitted 28 October, 2019;
originally announced October 2019.
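As an illustration of the general active-subspace idea mentioned in this abstract, the following is a minimal NumPy sketch, not the authors' PyTorch code. It assumes gradient samples of a scalar function are already available; the function name `active_subspace` and the `energy` threshold are hypothetical choices made here.

```python
import numpy as np

def active_subspace(grads, energy=0.99):
    """Estimate an active subspace from gradient samples.

    grads  : array of shape (M, n), one gradient sample per row.
    energy : fraction of the eigenvalue sum to retain (illustrative choice).
    Returns (eigvals, W1): eigenvalues in descending order and the
    n x r basis of the leading eigenvectors.
    """
    grads = np.asarray(grads, dtype=float)
    n = grads.shape[1]
    # Monte Carlo estimate of C = E[grad f grad f^T].
    C = grads.T @ grads / grads.shape[0]
    eigvals, eigvecs = np.linalg.eigh(C)        # ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    # Smallest r whose leading eigenvalues capture the requested energy.
    cum = np.cumsum(eigvals) / np.sum(eigvals)
    r = min(int(np.searchsorted(cum, energy)) + 1, n)
    return eigvals, eigvecs[:, :r]

# Example: f(x) = (a . x)^2 varies only along a, so the subspace is 1-D.
rng = np.random.default_rng(0)
a = np.array([3.0, 4.0, 0.0]) / 5.0
X = rng.normal(size=(50, 3))
grads = 2.0 * (X @ a)[:, None] * a              # gradient of (a . x)^2
eigvals, W1 = active_subspace(grads)
```

In this toy case the single retained direction is parallel to `a` (up to sign). The dimension `r` plays the role that the abstract describes as the number of "active" directions; how the paper maps this onto network layers is not reproduced here.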
-
Tensor Methods for Generating Compact Uncertainty Quantification and Deep Learning Models
Authors:
Chunfeng Cui,
Cole Hawkins,
Zheng Zhang
Abstract:
Tensor methods have become a promising tool to solve high-dimensional problems in the big data era. By exploiting possible low-rank tensor factorization, many high-dimensional model-based or data-driven problems can be solved to facilitate decision making or machine learning. In this paper, we summarize the recent applications of tensor computation in obtaining compact models for uncertainty quantification and deep learning. In uncertainty analysis where obtaining data samples is expensive, we show how tensor methods can significantly reduce the simulation or measurement cost. To enable the deployment of deep learning on resource-constrained hardware platforms, tensor methods can be used to significantly compress an over-parameterized neural network model or directly train a small-size model from scratch via optimization or statistical techniques. Recent Bayesian tensorized neural networks can automatically determine their tensor ranks in the training process.
Submitted 20 August, 2019;
originally announced August 2019.