-
Solving Motion Planning Tasks with a Scalable Generative Model
Authors:
Yihan Hu,
Siqi Chai,
Zhening Yang,
Jingyu Qian,
Kun Li,
Wenxin Shao,
Haichao Zhang,
Wei Xu,
Qiang Liu
Abstract:
As autonomous driving systems are being deployed to millions of vehicles, there is a pressing need to improve the system's scalability and safety and to reduce engineering cost. A realistic, scalable, and practical simulator of the driving world is highly desired. In this paper, we present an efficient solution based on generative models which learns the dynamics of the driving scenes. With this model, we can not only simulate the diverse futures of a given driving scenario but also generate a variety of driving scenarios conditioned on various prompts. Our innovative design allows the model to operate in both full-Autoregressive and partial-Autoregressive modes, significantly improving inference and training speed without sacrificing generative capability. This efficiency makes it ideal for use as an online reactive environment for reinforcement learning, an evaluator for planning policies, and a high-fidelity simulator for testing. We evaluated our model against two real-world datasets: the Waymo motion dataset and the nuPlan dataset. On the simulation realism and scene generation benchmark, our model achieves state-of-the-art performance. In the planning benchmarks, our planner outperforms prior art. We conclude that the proposed generative model may serve as a foundation for a variety of motion planning tasks, including data generation, simulation, planning, and online training. Source code is public at https://github.com/HorizonRobotics/GUMP/
Submitted 2 July, 2024;
originally announced July 2024.
-
Integrating Medical Imaging and Clinical Reports Using Multimodal Deep Learning for Advanced Disease Analysis
Authors:
Ziyan Yao,
Fei Lin,
Sheng Chai,
Weijie He,
Lu Dai,
Xinghui Fei
Abstract:
In this paper, an innovative multi-modal deep learning model is proposed to deeply integrate heterogeneous information from medical images and clinical reports. First, for medical images, convolutional neural networks were used to extract high-dimensional features and capture key visual information such as focal details, texture and spatial distribution. Secondly, for clinical report text, a two-way long and short-term memory network combined with an attention mechanism is used for deep semantic understanding, and key statements related to the disease are accurately captured. The two features interact and integrate effectively through the designed multi-modal fusion layer to realize the joint representation learning of image and text. In the empirical study, we selected a large medical image database covering a variety of diseases, combined with corresponding clinical reports for model training and validation. The proposed multimodal deep learning model demonstrated substantial superiority in the realms of disease classification, lesion localization, and clinical description generation, as evidenced by the experimental results.
Submitted 22 May, 2024;
originally announced May 2024.
-
Faraday: Synthetic Smart Meter Generator for the smart grid
Authors:
Sheng Chai,
Gus Chadney
Abstract:
Access to smart meter data is essential to rapid and successful transitions to electrified grids, underpinned by flexibility delivered by low carbon technologies, such as electric vehicles (EV) and heat pumps, and powered by renewable energy. Yet little of this data is available for research and modelling purposes due to consumer privacy protections. Whilst many are calling for raw datasets to be unlocked through regulatory changes, we believe this approach will take too long. Synthetic data addresses these challenges directly by overcoming privacy issues. In this paper, we present Faraday, a Variational Auto-encoder (VAE)-based model trained over 300 million smart meter data readings from an energy supplier in the UK, with information such as property type and low carbon technologies (LCTs) ownership. The model produces household-level synthetic load profiles conditioned on these labels, and we compare its outputs against actual substation readings to show how the model can be used for real-world applications by grid modellers interested in modelling energy grids of the future.
Submitted 5 April, 2024;
originally announced April 2024.
-
Measurement of the Chern Number for Non-Hermitian Chern Insulators
Authors:
Hongfang Liu,
Ming Lu,
Shengdu Chai,
Zhi-Qiang Zhang,
Hua Jiang
Abstract:
The identification of the topological invariant of a topological system is crucial in experiments. However, due to the inherent non-Hermitian features, such determination is notably challenging in non-Hermitian systems. Here, we propose that the magnetic effect can be utilized to measure the Chern number of the non-Hermitian Chern insulator. We find that the splitting of non-Hermitian bands under the magnetic field is Chern number dependent. Consequently, one can easily identify the Chern number by analyzing these splitting sub-bands. From the experimental perspective, the measurement of non-Hermitian bands is demonstrated in LC electric circuits. Furthermore, we find that the non-Hermiticity can drive open (closed) orbits of sub-bands in the Hermitian limit closed (open), which can also be identified by our proposal. These phenomena highlight the distinctive capabilities of non-Hermitian systems. Our results facilitate the detection of Chern numbers for non-Hermitian systems and may motivate further studies of their topological properties.
Submitted 29 January, 2024;
originally announced January 2024.
-
From Optimal Observables to Machine Learning: an Effective-Field-Theory Analysis of $e^+e^- \to W^+W^-$ at Future Lepton Colliders
Authors:
Shengdu Chai,
Jiayin Gu,
Lingfeng Li
Abstract:
We apply machine-learning techniques to the effective-field-theory analysis of the $e^+e^- \to W^+W^-$ processes at future lepton colliders, and demonstrate their advantages in comparison with conventional methods, such as optimal observables. Compared to traditional algorithms, we show that simulation-based inference methods are more robust to detector effects and backgrounds, and could in principle produce unbiased results with sufficient Monte Carlo simulation samples that accurately describe experiments. This is crucial for the analyses at future lepton colliders given the outstanding precision of the $e^+e^- \to W^+W^-$ measurement ($\sim 10^{-4}$ in terms of anomalous triple gauge couplings or even better) that can be reached. Our framework can be generalized to other effective-field-theory analyses, such as the one of $e^+e^- \to t\bar{t}$ or similar processes at muon colliders.
Submitted 30 June, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
Residual Stress-Driven Non-Euclidean Morphing in Origami Structures
Authors:
Zihe Liang,
Sibo Chai,
Qinyun Ding,
Kai Xiao,
Ke Liu,
Jiayao Ma,
Jaehyung Ju
Abstract:
Non-Euclidean surfaces are ubiquitous in numerous engineering fields, such as automotive, aerospace, and biomedical engineering domains. Morphing origami has numerous potential engineering applications, including soft robots, mechanical metamaterials, antennas, aerospace structures, and biomedical devices, owing to its intrinsic morphing features from two-dimensional (2D) planes to three-dimensional (3D) surfaces. However, the current one-dimensional (1D) hinge deformation-driven transformation of foldable origami with rigid or slightly deformable panels cannot achieve a 3D complex and large curvilinear morphing. Moreover, most active origami structures use thin hinges with soft materials on their creases, thus resulting in a lower load capability. This study proposes a novel origami morphing method that demonstrates large free-form surface morphing, e.g., Euclidean to non-Euclidean surface morphing with shape-locking. We embedded tensorial anisotropic stress in origami panels during the extrusion-based 3D printing of shape memory polymers. The extrusion-based 3D printing of isotropic shape memory polymers can produce tensorial anisotropic stress in origami panels during fabrication, which can realize large non-Euclidean surface morphing with multiple deformation modes. The connecting topology of the origami unit cells influences the global morphing behavior owing to the interaction of the deformation of adjacent panels. Non-Euclidean morphing integrated with four-dimensional (4D) printing can provide multimodal shape locking at material and structural levels. The non-Euclidean surface morphing caused by tensorial residual stress in the panel during 3D printing expands the design space of origami and kirigami structures.
Submitted 11 December, 2023;
originally announced December 2023.
-
ITI-GEN: Inclusive Text-to-Image Generation
Authors:
Cheng Zhang,
Xuanbai Chen,
Siqi Chai,
Chen Henry Wu,
Dmitry Lagun,
Thabo Beeler,
Fernando De la Torre
Abstract:
Text-to-image generative models often reflect the biases of the training data, leading to unequal representations of underrepresented groups. This study investigates inclusive text-to-image generative models that generate images based on human-written prompts and ensure the resulting images are uniformly distributed across attributes of interest. Unfortunately, directly expressing the desired attributes in the prompt often leads to sub-optimal results due to linguistic ambiguity or model misrepresentation. Hence, this paper proposes a drastically different approach that adheres to the maxim that "a picture is worth a thousand words". We show that, for some attributes, images can represent concepts more expressively than text. For instance, categories of skin tones are typically hard to specify by text but can be easily represented by example images. Building upon these insights, we propose a novel approach, ITI-GEN, that leverages readily available reference images for Inclusive Text-to-Image GENeration. The key idea is learning a set of prompt embeddings to generate images that can effectively represent all desired attribute categories. More importantly, ITI-GEN requires no model fine-tuning, making it computationally efficient to augment existing text-to-image models. Extensive experiments demonstrate that ITI-GEN largely improves over state-of-the-art models to generate inclusive images from a prompt. Project page: https://czhang0528.github.io/iti-gen.
Submitted 11 September, 2023;
originally announced September 2023.
-
A Cryogenic Tune and Match Circuit for Magnetic Resonance Microscopy at 15.2T
Authors:
Benjamin M. Hardy,
Gary Drake,
Shuyang Chai,
Bibek Dhakal,
Jonathan B. Martin,
Junzhong Xu,
Mark D. Does,
Adam W. Anderson,
Xinqiang Yan,
John C. Gore
Abstract:
Signal to noise ratios (SNR) in magnetic resonance microscopy images are limited by acquisition times and the decreasing number of spins in smaller voxels. Significant SNR gains from cooling of the RF receiver are only realized when the Johnson noise generated within the RF hardware is large compared to the electromagnetic noise produced by the sample. Cryogenic cooling of imaging probes is common in high field systems but proves difficult to insulate the sample from extreme temperatures. We designed a chamber to cool only the tune and match circuitry to show it is possible to achieve much of the available SNR gain available for cooled coils. We designed a microcoil circuit to resonate at 650 MHz for imaging on a 15.2 T scanner. Surface loops and solenoids of varying diameters were tested to determine the largest diameter coil that demonstrated significant SNR gains from cooling. A liquid N2 cryochamber was designed to cool the circuitry while leaving the RF coil in ambient air. As the cryochamber was filled with liquid N2, Q-factors were measured on the bench while monitoring the coil's surface temperature. Improvements of SNR on images of solutions were demonstrated via cooling the tune and match circuit in the magnet bore. At 650 MHz, loops and solenoids < 3 mm in diameter showed significant improvements in quality factor on the bench. The resistance of the variable capacitors and the coaxial cable were measured to be 45% and 32% of room temperature values near the Larmor frequency. Images obtained with a 2 turn, 3 mm diameter loop with the matching circuit at room temperature and then cooled with liquid nitrogen demonstrated SNR improvements of a factor of 2. By cooling the tune and match circuit and leaving the surface loop in ambient air, SNR was improved by a factor of 2. The results are significant because it allows for more space to insulate the sample from extreme temperatures.
Submitted 24 August, 2023;
originally announced August 2023.
-
Cross-modality Attention Adapter: A Glioma Segmentation Fine-tuning Method for SAM Using Multimodal Brain MR Images
Authors:
Xiaoyu Shi,
Shurong Chai,
Yinhao Li,
Jingliang Cheng,
Jie Bai,
Guohua Zhao,
Yen-Wei Chen
Abstract:
According to the 2021 World Health Organization (WHO) Classification scheme for gliomas, glioma segmentation is a very important basis for diagnosis and genotype prediction. In general, 3D multimodal brain MRI is an effective diagnostic tool. In the past decade, there has been an increase in the use of machine learning, particularly deep learning, for medical images processing. Thanks to the development of foundation models, models pre-trained with large-scale datasets have achieved better results on a variety of tasks. However, for medical images with small dataset sizes, deep learning methods struggle to achieve better results on real-world image datasets. In this paper, we propose a cross-modality attention adapter based on multimodal fusion to fine-tune the foundation model to accomplish the task of glioma segmentation in multimodal MRI brain images with better results. The effectiveness of the proposed method is validated via our private glioma data set from the First Affiliated Hospital of Zhengzhou University (FHZU) in Zhengzhou, China. Our proposed method is superior to current state-of-the-art methods with a Dice of 88.38% and Hausdorff distance of 10.64, thereby exhibiting a 4% increase in Dice to segment the glioma region for glioma treatment.
Submitted 3 July, 2023;
originally announced July 2023.
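The abstract above reports a Dice score and a Hausdorff distance. As a minimal sketch of how these two segmentation metrics are commonly computed, the NumPy code below uses hypothetical function names (`dice_score`, `hausdorff_distance`) and plain textbook definitions. The abstract does not say which Hausdorff variant or units the authors used, so this should not be read as their evaluation code.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two point sets of shape (N, D) and (M, D)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance in each direction, then the maximum of the two.
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

For masks, the point sets passed to `hausdorff_distance` would typically be the coordinates of the foreground (or boundary) voxels, for example obtained with `np.argwhere(mask)`.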
-
Towards Sustainable Ultrawide Bandgap Van der Waals Materials: An ab initio Screening Effort
Authors:
Chuin Wei Tan,
Linqiang Xu,
Chen Chen Er,
Siang-Piao Chai,
Boris Kozinsky,
Hui Ying Yang,
Shengyuan A. Yang,
Jing Lu,
Yee Sin Ang
Abstract:
The sustainable development of next-generation device technology is paramount in the face of climate change and the looming energy crisis. Tremendous efforts have been made in the discovery and design of nanomaterials that achieve device-level sustainability, where high performance and low operational energy cost are prioritized. However, many of such materials are composed of elements that are under threat of depletion and pose elevated risks to the environment. The role of material-level sustainability in computational screening efforts remains an open question thus far. Here we develop a general van der Waals materials screening framework imbued with sustainability-motivated search criteria. Using ultrawide bandgap (UWBG) materials as a backdrop -- an emerging materials class with great prospects in dielectric, power electronics, and ultraviolet device applications, we demonstrate how this screening framework results in 25 sustainable UWBG layered materials comprising only of low-risks elements. Our findings constitute a critical first-step towards reinventing a more sustainable electronics landscape beyond silicon, with the framework established in this work serving as a harbinger of sustainable 2D materials discovery.
Submitted 25 October, 2023; v1 submitted 26 June, 2023;
originally announced June 2023.
-
Ladder Fine-tuning approach for SAM integrating complementary network
Authors:
Shurong Chai,
Rahul Kumar Jain,
Shiyu Teng,
Jiaqing Liu,
Yinhao Li,
Tomoko Tateyama,
Yen-wei Chen
Abstract:
Recently, foundation models have been introduced that perform a variety of tasks in the field of computer vision. These models, such as the Segment Anything Model (SAM), are generalized models trained using huge datasets. Currently, ongoing research focuses on exploring the effective utilization of these generalized models for specific domains, such as medical imaging. However, in medical imaging, the lack of training samples due to privacy concerns and other factors presents a major challenge for applying these generalized models to medical image segmentation tasks. To address this issue, the effective fine-tuning of these models is crucial to ensure their optimal utilization. In this study, we propose to combine a complementary Convolutional Neural Network (CNN) with the standard SAM network for medical image segmentation. To reduce the burden of fine-tuning a large foundation model and implement a cost-efficient training scheme, we focus only on fine-tuning the additional CNN network and the SAM decoder part. This strategy significantly reduces training time and achieves competitive results on a publicly available dataset. The code is available at https://github.com/11yxk/SAM-LST.
Submitted 22 June, 2023;
originally announced June 2023.
-
LayoutDM: Transformer-based Diffusion Model for Layout Generation
Authors:
Shang Chai,
Liansheng Zhuang,
Fengying Yan
Abstract:
Automatic layout generation that can synthesize high-quality layouts is an important tool for graphic design in many applications. Though existing methods based on generative models such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) have progressed, they still leave much room for improving the quality and diversity of the results. Inspired by the recent success of diffusion models in generating high-quality images, this paper explores their potential for conditional layout generation and proposes Transformer-based Layout Diffusion Model (LayoutDM) by instantiating the conditional denoising diffusion probabilistic model (DDPM) with a purely transformer-based architecture. Instead of using convolutional neural networks, a transformer-based conditional Layout Denoiser is proposed to learn the reverse diffusion process to generate samples from noised layout data. Benefitting from both transformer and DDPM, our LayoutDM is of desired properties such as high-quality generation, strong sample diversity, faithful distribution coverage, and stationary training in comparison to GANs and VAEs. Quantitative and qualitative experimental results show that our method outperforms state-of-the-art generative models in terms of quality and diversity.
Submitted 4 May, 2023;
originally announced May 2023.
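The abstract above builds on the denoising diffusion probabilistic model (DDPM), in which a denoiser learns to reverse a fixed forward noising process. As a rough illustration of that forward process only, the sketch below samples from the standard closed form q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 − ᾱ_t) I). The linear beta schedule, the function names, and the (num_elements, 4) box layout are assumptions for illustration and are not taken from the LayoutDM code.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed, commonly used schedule)."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(x0, t, alphas_cumprod, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    abar_t = alphas_cumprod[t]
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * noise
    # The noise is returned as well because a denoiser is usually trained to predict it.
    return x_t, noise


# Hypothetical layout: 5 elements, each a normalized box (x, y, w, h).
rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alphas_cumprod = np.cumprod(1.0 - betas)
layout = rng.uniform(0.0, 1.0, size=(5, 4))
noised_layout, noise = q_sample(layout, t=500, alphas_cumprod=alphas_cumprod, rng=rng)
```

In a full model, a conditional denoiser (transformer-based in the paper) would take `noised_layout`, the step `t`, and the conditioning attributes, and learn the reverse process.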
-
Planning-oriented Autonomous Driving
Authors:
Yihan Hu,
Jiazhi Yang,
Li Chen,
Keyu Li,
Chonghao Sima,
Xizhou Zhu,
Siqi Chai,
Senyao Du,
Tianwei Lin,
Wenhai Wang,
Lewei Lu,
Xiaosong Jia,
Qiang Liu,
Jifeng Dai,
Yu Qiao,
Hongyang Li
Abstract:
Modern autonomous driving system is characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and achieve advanced-level intelligence, contemporary approaches either deploy standalone models for individual tasks, or design a multi-task paradigm with separate heads. However, they might suffer from accumulative errors or deficient task coordination. Instead, we argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car. Oriented at this, we revisit the key components within perception and prediction, and prioritize the tasks such that all these tasks contribute to planning. We introduce Unified Autonomous Driving (UniAD), a comprehensive framework up-to-date that incorporates full-stack driving tasks in one network. It is exquisitely devised to leverage advantages of each module, and provide complementary feature abstractions for agent interaction from a global perspective. Tasks are communicated with unified query interfaces to facilitate each other toward planning. We instantiate UniAD on the challenging nuScenes benchmark. With extensive ablations, the effectiveness of using such a philosophy is proven by substantially outperforming previous state-of-the-arts in all aspects. Code and models are public.
Submitted 23 March, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Accommodating the CDF W-boson Mass Measurement in the Beautiful Mirror Model
Authors:
Shengdu Chai,
Jiayin Gu,
Lian-Tao Wang
Abstract:
The W-boson mass measurement recently reported by the CDF II experiment exhibits a significant deviation from both the Standard Model prediction and previous measurements. There is also a long-standing deviation between the Standard Model prediction of the forward-backward asymmetry of the bottom quark ($A^{0,b}_{\rm FB}$) and its measurement at the LEP experiment. The Beautiful Mirror model, proposed to resolve the $A^{0,b}_{\rm FB}$ discrepancy, introduces vector-like quarks that modify the W-boson mass at one-loop level. In this study, we find an interesting region in the model parameter space that could potentially explain both discrepancies, which puts the new quarks in the multi-TeV region. This region is mostly consistent with current LHC bounds from direct searches and Higgs coupling measurements, but will be thoroughly probed at the High Luminosity LHC. As such, the Beautiful Mirror model as an explanation of the $m_W$ and $A^{0,b}_{\rm FB}$ discrepancies could be confirmed or falsified in the near future.
Submitted 19 May, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Security Defense of Large Scale Networks Under False Data Injection Attacks: An Attack Detection Scheduling Approach
Authors:
Yuhan Suo,
Senchun Chai,
Runqi Chai,
Zhong-Hua Pang,
Yuanqing Xia,
Guo-Ping Liu
Abstract:
In large-scale networks, communication links between nodes are easily injected with false data by adversaries. This paper proposes a novel security defense strategy from the perspective of attack detection scheduling to ensure the security of the network. Based on the proposed strategy, each sensor can directly exclude suspicious sensors from its neighboring set. First, the problem of selecting suspicious sensors is formulated as a combinatorial optimization problem, which is non-deterministic polynomial-time hard (NP-hard). To solve this problem, the original function is transformed into a submodular function. Then, we propose an attack detection scheduling algorithm based on the sequential submodular optimization theory, which incorporates \emph{expert problem} to better utilize historical information to guide the sensor selection task at the current moment. For different attack strategies, theoretical results show that the average optimization rate of the proposed algorithm has a lower bound, and the error expectation is bounded. In addition, under two kinds of insecurity conditions, the proposed algorithm can guarantee the security of the entire network from the perspective of the augmented estimation error. Finally, the effectiveness of the developed method is verified by the numerical simulation and practical experiment.
Submitted 17 December, 2023; v1 submitted 11 December, 2022;
originally announced December 2022.
-
Numerical analysis of a time discretized method for nonlinear filtering problem with Lévy process observations
Authors:
Fengshan Zhang,
Yongkui Zou,
Shimin Chai,
Yanzhao Cao
Abstract:
In this paper, we consider a nonlinear filtering model with observations driven by correlated Wiener processes and point processes. We first derive a Zakai equation whose solution is an unnormalized probability density function of the filter solution. Then we apply a splitting-up technique to decompose the Zakai equation into three stochastic differential equations, based on which we construct a splitting-up approximate solution and prove its half-order convergence. Furthermore, we apply a finite difference method to construct a time semi-discrete approximate solution to the splitting-up system and prove its half-order convergence to the exact solution of the Zakai equation. Finally, we present some numerical experiments to demonstrate the theoretical analysis.
Submitted 27 November, 2022;
originally announced November 2022.
-
Weak Galerkin finite element method for linear poroelasticity problems
Authors:
Shanshan Gu,
Shimin Chai,
Chenguang Zhou
Abstract:
This paper is devoted to a weak Galerkin (WG) finite element method for linear poroelasticity problems where weakly defined divergence and gradient operators over discontinuous functions are introduced. We establish both the continuous and discrete time WG schemes, and obtain their optimal convergence order estimates in a discrete $H^1$ norm for the displacement and in an $H^1$ type and $L^2$ norms for the pressure. Finally, numerical experiments are presented to illustrate the theoretical error results in different kinds of meshes which shows the WG flexibility for mesh selections, and to verify the locking-free property of our proposed method.
Submitted 9 August, 2022;
originally announced August 2022.
-
A Transformer-based Generative Adversarial Network for Brain Tumor Segmentation
Authors:
Liqun Huang,
Long Chen,
Baihai Zhang,
Senchun Chai
Abstract:
Brain tumor segmentation remains a challenge in medical image segmentation tasks. With the application of transformers in various computer vision tasks, transformer blocks show the capability of learning long-distance dependency in global space, which is complementary to CNNs. In this paper, we propose a novel transformer-based generative adversarial network to automatically segment brain tumors with multi-modality MRI. Our architecture consists of a generator and a discriminator, which are trained in a min-max game. The generator is based on a typical "U-shaped" encoder-decoder architecture, whose bottom layer is composed of transformer blocks with resnet. Besides, the generator is trained with deep supervision technology. The discriminator we designed is a CNN-based network with multi-scale $L_{1}$ loss, which has proved effective for medical semantic image segmentation. To validate the effectiveness of our method, we conducted experiments on the BRATS2015 dataset, achieving comparable or better performance than previous state-of-the-art methods.
Submitted 28 July, 2022; v1 submitted 28 July, 2022;
originally announced July 2022.
-
One-shot Neural Backdoor Erasing via Adversarial Weight Masking
Authors:
Shuwen Chai,
Jinghui Chen
Abstract:
Recent studies show that despite achieving high accuracy on a number of real-world applications, deep neural networks (DNNs) can be backdoored: by injecting triggered data samples into the training dataset, the adversary can mislead the trained model into classifying any test data to the target class as long as the trigger pattern is presented. To nullify such backdoor threats, various methods have been proposed. Particularly, a line of research aims to purify the potentially compromised model. However, one major limitation of this line of work is the requirement to access sufficient original training data: the purifying performance is a lot worse when the available training data is limited. In this work, we propose Adversarial Weight Masking (AWM), a novel method capable of erasing the neural backdoors even in the one-shot setting. The key idea behind our method is to formulate this into a min-max optimization problem: first, adversarially recover the trigger patterns and then (soft) mask the network weights that are sensitive to the recovered patterns. Comprehensive evaluations of several benchmark datasets suggest that AWM can largely improve the purifying effects over other state-of-the-art methods on various available training dataset sizes.
Submitted 1 November, 2022; v1 submitted 10 July, 2022;
originally announced July 2022.
-
The THUEE System Description for the IARPA OpenASR21 Challenge
Authors:
Jing Zhao,
Haoyu Wang,
Jinpeng Li,
Shuzhou Chai,
Guan-Bo Wang,
Guoguo Chen,
Wei-Qiang Zhang
Abstract:
This paper describes the THUEE team's speech recognition system for the IARPA Open Automatic Speech Recognition Challenge (OpenASR21), with further experiment explorations. We achieve outstanding results under both the Constrained and Constrained-plus training conditions. For the Constrained training condition, we construct our basic ASR system based on the standard hybrid architecture. To alleviate the Out-Of-Vocabulary (OOV) problem, we extend the pronunciation lexicon using Grapheme-to-Phoneme (G2P) techniques for both OOV and potential new words. Standard acoustic model structures such as CNN-TDNN-F and CNN-TDNN-F-A are adopted. In addition, multiple data augmentation techniques are applied. For the Constrained-plus training condition, we use the self-supervised learning framework wav2vec2.0. We experiment with various fine-tuning techniques with the Connectionist Temporal Classification (CTC) criterion on top of the publicly available pre-trained model XLSR-53. We find that the frontend feature extractor plays an important role when applying the wav2vec2.0 pre-trained model to the encoder-decoder based CTC/Attention ASR architecture. Extra improvements can be achieved by using the CTC model finetuned in the target language as the frontend feature extractor.
Submitted 29 June, 2022;
originally announced June 2022.
-
HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction
Authors:
Yihan Hu,
Wenxin Shao,
Bo Jiang,
Jiajie Chen,
Siqi Chai,
Zhening Yang,
Jingyu Qian,
Helong Zhou,
Qiang Liu
Abstract:
In this report, we introduce our solution to the Occupancy and Flow Prediction challenge in the Waymo Open Dataset Challenges at CVPR 2022, which ranks 1st on the leaderboard. We have developed a novel hierarchical spatial-temporal network featured with spatial-temporal encoders, a multi-scale aggregator enriched with latent variables, and a recursive hierarchical 3D decoder. We use multiple losses including focal loss and modified flow trace loss to efficiently guide the training process. Our method achieves a Flow-Grounded Occupancy AUC of 0.8389 and outperforms all the other teams on the leaderboard.
Submitted 21 June, 2022;
originally announced June 2022.
-
Enhancing fiber atom interferometer by in-fiber laser cooling
Authors:
Yu Wang,
Shijie Chai,
Thomas Billotte,
Zilong Chen,
Mingjie Xin,
Wui Seng Leong,
Foued Amrani,
Benoit Debord,
Fetah Benabid,
Shau-Yu Lan
Abstract:
We demonstrate an inertia sensitive atom interferometer optically guided inside a 22-cm-long negative curvature hollow-core photonic crystal fiber with an interferometer time of 20 ms. The result prolongs the previous fiber guided atom interferometer time by three orders of magnitude. The improvement arises from the realization of in-fiber Λ-enhanced gray molasses and delta-kick cooling to cool atoms from 32 μK to below 1 μK in 4 ms. The in-fiber cooling overcomes the inevitable heating during the atom loading process and allows a shallow guiding optical potential to minimize decoherence. Our results permit bringing atoms close to source fields for sensing and could lead to compact inertial quantum sensors with a sub-millimeter resolution.
Submitted 19 December, 2021;
originally announced December 2021.
-
Hong Kong Air Traffic: Explanation and Prediction based on Sparse Seasonal ARIMA Model
Authors:
Shuwen Chai
Abstract:
The monthly air traffic of a city is a time series with an obvious seasonal pattern, and is closely related to the economic situation and social environment of the city. In Hong Kong, for example, July, August, and October tend to be the peak season of traffic flow, while there is also a relatively fixed off-season. In the case of a stable social environment, a carefully identified and fitted seasonal ARIMA model can predict the traffic flow in future months well. This work selects the air traffic data, including arrival and departure passengers of Hong Kong, after the financial crisis and before the political storm. A sparse seasonal ARIMA$(0,1,1)\times(4,1,0)_{12}$ is built, which can correctly predict the air traffic from January to July in 2020 within its $95\%$ confidence interval. Furthermore, this work decomposes the time series and finds that important events, like the financial crisis, political storm, and the COVID-19 outbreak, affect the level of air traffic to some extent. For example, the political storm and epidemic prevention and control that happened after 2019 made the air traffic drop significantly. According to my sparse seasonal ARIMA model, the air traffic from February to November in 2020 is only $5\%$ of what it should be without these two events. This is a valuable application of the time-series model in air traffic loss estimation.
Submitted 12 August, 2021;
originally announced August 2021.
-
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio
Authors:
Guoguo Chen,
Shuzhou Chai,
Guanbo Wang,
Jiayu Du,
Wei-Qiang Zhang,
Chao Weng,
Dan Su,
Daniel Povey,
Jan Trmal,
Junbo Zhang,
Mingjie Jin,
Sanjeev Khudanpur,
Shinji Watanabe,
Shuaijiang Zhao,
Wei Zou,
Xiangang Li,
Xuchen Yao,
Yongqing Wang,
Yujun Wang,
Zhao You,
Zhiyong Yan
Abstract:
This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training. Around 40,000 hours of transcribed audio is first collected from audiobooks, podcasts and YouTube, covering both read and spontaneous speaking styles, and a variety of topics, such as arts, science, sports, etc. A new forced alignment and segmentation pipeline is proposed to create sentence segments suitable for speech recognition training, and to filter out segments with low-quality transcription. For system training, GigaSpeech provides five subsets of different sizes, 10h, 250h, 1000h, 2500h, and 10000h. For our 10,000-hour XL training subset, we cap the word error rate at 4% during the filtering/validation stage, and for all our other smaller training subsets, we cap it at 0%. The DEV and TEST evaluation sets, on the other hand, are re-processed by professional human transcribers to ensure high transcription quality. Baseline systems are provided for popular speech recognition toolkits, namely Athena, ESPnet, Kaldi and Pika.
Submitted 13 June, 2021;
originally announced June 2021.
-
Quantization-Guided Training for Compact TinyML Models
Authors:
Sedigh Ghamari,
Koray Ozcan,
Thu Dinh,
Andrey Melnikov,
Juan Carvajal,
Jan Ernst,
Sek Chai
Abstract:
We propose a Quantization Guided Training (QGT) method to guide DNN training towards optimized low-bit-precision targets and reach extreme compression levels below 8-bit precision. Unlike standard quantization-aware training (QAT) approaches, QGT uses customized regularization to encourage weight values towards a distribution that maximizes accuracy while reducing quantization errors. One of the main benefits of this approach is the ability to identify compression bottlenecks. We validate QGT using state-of-the-art model architectures on vision datasets. We also demonstrate the effectiveness of QGT with an 81KB tiny model for person detection down to 2-bit precision (representing 17.7x size reduction), while maintaining an accuracy drop of only 3% compared to a floating-point baseline.
Submitted 10 March, 2021;
originally announced March 2021.
-
Dark-state sideband cooling in an atomic ensemble
Authors:
Chang Huang,
Shijie Chai,
Shau-Yu Lan
Abstract:
We utilize the dark state in a Λ-type three-level system to cool an ensemble of 85Rb atoms in an optical lattice [Morigi et al., Phys. Rev. Lett. 85, 4458 (2000)]. The common suppression of the carrier transition of atoms with different vibrational frequencies allows them to reach a subrecoil temperature of 100 nK after being released from the optical lattice. A nearly zero vibrational quantum number is determined from the time-of-flight measurements and adiabatic expansion process. The features of sideband cooling are examined in various parameter spaces. Our results show that dark-state sideband cooling is a simple and compelling method for preparing a large ensemble of atoms into their vibrational ground state of a harmonic potential and can be generalized to different species of atoms and molecules for studying ultracold physics that demands recoil temperature and below.
Submitted 12 January, 2021;
originally announced January 2021.
-
Reducing Textural Bias Improves Robustness of Deep Segmentation Models
Authors:
Seoin Chai,
Daniel Rueckert,
Ahmed E. Fetit
Abstract:
Despite advances in deep learning, robustness under domain shift remains a major bottleneck in medical imaging settings. Findings on natural images suggest that deep neural models can show a strong textural bias when carrying out image classification tasks. In this thorough empirical study, we draw inspiration from findings on natural images and investigate ways in which addressing the textural bias phenomenon could bring up the robustness of deep segmentation models when applied to three-dimensional (3D) medical data. To achieve this, publicly available MRI scans from the Developing Human Connectome Project are used to study ways in which simulating textural noise can help train robust models in a complex semantic segmentation task. We contribute an extensive empirical investigation consisting of 176 experiments and illustrate how applying specific types of simulated textural noise prior to training can lead to texture invariant models, resulting in improved robustness when segmenting scans corrupted by previously unseen noise types and levels.
Submitted 27 June, 2021; v1 submitted 30 November, 2020;
originally announced November 2020.
-
Subtensor Quantization for Mobilenets
Authors:
Thu Dinh,
Andrey Melnikov,
Vasilios Daskalopoulos,
Sek Chai
Abstract:
Quantization for deep neural networks (DNN) has enabled developers to deploy models with less memory and more efficient low-power inference. However, not all DNN designs are friendly to quantization. For example, the popular Mobilenet architecture has been tuned to reduce parameter size and computational latency with separable depth-wise convolutions, but not all quantization algorithms work well and the accuracy can suffer relative to its floating-point versions. In this paper, we analyzed several root causes of quantization loss and proposed alternatives that do not rely on per-channel or training-aware approaches. We evaluate the image classification task on the ImageNet dataset, and our post-training quantized 8-bit inference top-1 accuracy is within 0.7% of the floating-point version.
Submitted 4 November, 2020;
originally announced November 2020.
-
Dynamically Throttleable Neural Networks (TNN)
Authors:
Hengyue Liu,
Samyak Parajuli,
Jesse Hostetler,
Sek Chai,
Bir Bhanu
Abstract:
Conditional computation for Deep Neural Networks (DNNs) reduces overall computational load and improves model accuracy by running a subset of the network. In this work, we present a runtime throttleable neural network (TNN) that can adaptively self-regulate its own performance target and computing resources. We designed TNN with several properties that enable more flexibility for dynamic execution based on runtime context. TNNs are defined as throttleable modules gated with a separately trained controller that generates a single utilization control parameter. We validate our proposal in a number of experiments, including Convolutional Neural Networks (CNNs such as VGG, ResNet, ResNeXt, DenseNet) using the CIFAR-10 and ImageNet datasets, for object classification and recognition tasks. We also demonstrate the effectiveness of dynamic TNN execution on a 3D Convolution Network (C3D) for a hand gesture task. Results show that TNN can maintain peak accuracy performance compared to vanilla solutions, while providing a graceful reduction in computational requirement, with up to a 74% reduction in latency and 52% energy savings.
Submitted 1 November, 2020;
originally announced November 2020.
-
Large array of Schrödinger cat states facilitated by an optical waveguide
Authors:
Wui Seng Leong,
Mingjie Xin,
Zilong Chen,
Shijie Chai,
Yu Wang,
Shau-Yu Lan
Abstract:
Quantum engineering using photonic structures offers new capabilities for atom-photon interactions for quantum optics and atomic physics, which could eventually lead to integrated quantum devices. Despite the rapid progress in the variety of structures, coherent excitation of the motional states of atoms in a photonic waveguide using guided modes has yet to be demonstrated. Here, we use the waveguide mode of a hollow-core photonic crystal fibre to manipulate the mechanical Fock states of single atoms in a harmonic potential inside the fibre. We create a large array of Schrödinger cat states, a quintessential feature of quantum physics and a key element in quantum information processing and metrology, of approximately 15000 atoms along the fibre by entangling the electronic state with the coherent harmonic oscillator state of each individual atom. Our results provide a useful step for quantum information and simulation with a wide range of photonic waveguide systems.
Submitted 20 October, 2020;
originally announced October 2020.
-
Causal Mechanism Transfer Network for Time Series Domain Adaptation in Mechanical Systems
Authors:
Zijian Li,
Ruichu Cai,
Kok Soon Chai,
Hong Wei Ng,
Hoang Dung Vu,
Marianne Winslett,
Tom Z. J. Fu,
Boyan Xu,
Xiaoyan Yang,
Zhenjie Zhang
Abstract:
Data-driven models are becoming essential parts of modern mechanical systems, commonly used to capture the behavior of various equipment and varying environmental characteristics. Despite the advantages of these data-driven models in adapting to high dynamics and aging equipment, they usually require massive numbers of labels over historical data, mostly contributed by human engineers at an extremely high cost. The label demand is now the major limiting factor for modeling accuracy, hindering the fulfillment of visions for applications. Fortunately, domain adaptation enhances model generalization by utilizing the labelled source data as well as the unlabelled target data, so that the model can be reused on different domains. However, the mainstream domain adaptation methods cannot achieve ideal performance on time series data, because most of them focus on static samples, and even the existing time series domain adaptation methods ignore properties of time series data such as the temporal causal mechanism. In this paper, we assume that the causal mechanism is invariant and present our Causal Mechanism Transfer Network (CMTN) for time series domain adaptation. By capturing and transferring the dynamic and temporal causal mechanism of multivariate time series data and alleviating the time lags and different value ranges among different machines, CMTN allows the data-driven models to exploit existing data and labels from similar systems, such that the resulting model on a new system is highly reliable even with very limited data. We report our empirical results and lessons learned from two real-world case studies, on chiller plant energy optimization and boiler fault detection, in which CMTN outperforms the existing state-of-the-art method.
Submitted 13 October, 2019;
originally announced October 2019.
-
Bit Efficient Quantization for Deep Neural Networks
Authors:
Prateeth Nayak,
David Zhang,
Sek Chai
Abstract:
Quantization for deep neural networks has afforded models for edge devices that use less on-board memory and enable efficient low-power inference. In this paper, we present a comparison of model-parameter driven quantization approaches that can achieve as low as 3-bit precision without affecting accuracy. The post-training quantization approaches are data-free, and the resulting weight values are closely tied to the dataset distribution on which the model has converged to optimality. We show quantization results for a number of state-of-the-art deep neural networks (DNN) using large datasets such as ImageNet. To better analyze quantization results, we describe the overall range and local sparsity of values afforded through various quantization schemes. We show methods to lower bit-precision beyond quantization limits with object class clustering.
Submitted 7 October, 2019;
originally announced October 2019.
-
Voting for Distortion Points in Geometric Processing
Authors:
Shuangming Chai,
Xiao-Ming Fu,
Ligang Liu
Abstract:
Low isometric distortion is often required for mesh parameterizations. A configuration of some vertices, where the distortion is concentrated, provides a way to mitigate isometric distortion, but determining the number and placement of these vertices is non-trivial. We call these vertices distortion points. We present a novel and automatic method to detect distortion points using a voting strategy. Our method integrates two components: candidate generation and candidate voting. Given a closed triangular mesh, we generate candidate distortion points by executing a three-step procedure repeatedly: (1) randomly cut an input to a disk topology; (2) compute a low conformal distortion parameterization; and (3) detect the distortion points. Finally, we count the candidate points and generate the final distortion points by voting. We demonstrate that our algorithm succeeds when employed on various closed meshes with a genus of zero or higher. The distortion points generated by our method are utilized in three applications, including planar parameterization, semi-automatic landmark correspondence, and isotropic remeshing. Compared to other state-of-the-art methods, our method demonstrates stronger practical robustness in distortion point detection.
Submitted 1 November, 2019; v1 submitted 28 September, 2019;
originally announced September 2019.
-
Generative Memory for Lifelong Reinforcement Learning
Authors:
Aswin Raghavan,
Jesse Hostetler,
Sek Chai
Abstract:
Our research is focused on understanding and applying biological memory transfers to new AI systems that can fundamentally improve their performance, throughout their fielded lifetime experience. We leverage current understanding of biological memory transfer to arrive at AI algorithms for memory consolidation and replay. In this paper, we propose the use of generative memory that can be recalled in batch samples to train a multi-task agent in a pseudo-rehearsal manner. We show results motivating the need for task-agnostic separation of latent space for the generative memory to address issues of catastrophic forgetting in lifelong learning.
Submitted 21 February, 2019;
originally announced February 2019.
-
Beam test performance of the highly granular SiW-ECAL technological prototype for the ILC
Authors:
K. Kawagoe,
Y. Miura,
I. Sekiya,
T. Suehara,
T. Yoshioka,
S. Bilokin,
J. Bonis,
P. Cornebise,
A. Gallas,
A. Irles,
R. Pöschl,
F. Richard,
A. Thiebault,
D. Zerwas,
M. Anduze,
V. Balagura,
V. Boudry,
J-C. Brient,
E. Edy,
G. Fayolle,
M. Frotin,
F. Gastaldi,
R. Guillaumat,
A. Lobanov,
M. Louzir, et al. (19 additional authors not shown)
Abstract:
The technological prototype of the CALICE highly granular silicon-tungsten electromagnetic calorimeter (SiW-ECAL) was tested in a beam at DESY in 2017. The setup comprised seven layers of silicon sensors. Each layer comprised four sensors, with each sensor containing an array of 256 $5.5\times5.5$ mm$^2$ silicon PIN diodes. The four sensors covered a total area of $18\times18$ cm$^2$, and comprised a total of 1024 channels. The readout was split into a trigger line and a charge signal line. Key performance results for signal over noise for the two output lines are presented, together with a study of the uniformity of the detector response. Measurements of the response to electrons for the tungsten loaded version of the detector are also presented.
Submitted 22 October, 2019; v1 submitted 31 January, 2019;
originally announced February 2019.
-
Data Driven Chiller Plant Energy Optimization with Domain Knowledge
Authors:
Hoang Dung Vu,
Kok Soon Chai,
Bryan Keating,
Nurislam Tursynbek,
Boyan Xu,
Kaige Yang,
Xiaoyan Yang,
Zhenjie Zhang
Abstract:
Refrigeration and chiller optimization is an important and well-studied topic in mechanical engineering, mostly relying on physical models of the equipment that are built on over-simplified assumptions. Conventional optimization techniques using physical models make online parameter tuning decisions based on very limited information about hardware specifications and external conditions, e.g., outdoor weather. In recent years, a new generation of sensors has become an essential part of new chiller plants, for the first time allowing system administrators to continuously monitor the running status of all equipment in a timely and accurate way. The explosive growth of data flowing to databases, driven by the increasing analytical power of machine learning and data mining, unveils new possibilities for data-driven approaches to real-time chiller plant optimization. This paper presents our research and industrial experience on the adoption of data models and optimizations for chiller plants and discusses the lessons learnt from our practice on real-world plants. Instead of employing complex machine learning models, we emphasize the incorporation of appropriate domain knowledge into data analysis tools, which turns out to be the key performance improver over state-of-the-art deep learning techniques by a significant margin. Our empirical evaluation on a real-world chiller plant achieves savings of more than 7% on daily power consumption.
Submitted 3 December, 2018;
originally announced December 2018.
-
Bootstrapping Deep Neural Networks from Approximate Image Processing Pipelines
Authors:
Kilho Son,
Jesse Hostetler,
Sek Chai
Abstract:
Complex image processing and computer vision systems often consist of a processing pipeline of functional modules. We intend to replace parts or all of a target pipeline with deep neural networks to achieve benefits such as increased accuracy or reduced computational requirement. To acquire a large amount of labeled data necessary to train the deep neural network, we propose a workflow that leverages the target pipeline to create a significantly larger labeled training set automatically, without prior domain knowledge of the target pipeline. We show experimentally that despite the noise introduced by automated labeling and only using a very small initially labeled data set, the trained deep neural networks can achieve similar or even better performance than the components they replace, while in some cases also reducing computational requirements.
Submitted 15 February, 2019; v1 submitted 29 November, 2018;
originally announced November 2018.
-
Generalized Ternary Connect: End-to-End Learning and Compression of Multiplication-Free Deep Neural Networks
Authors:
Samyak Parajuli,
Aswin Raghavan,
Sek Chai
Abstract:
The use of deep neural networks in edge computing devices hinges on the balance between accuracy and complexity of computations. Ternary Connect (TC) \cite{lin2015neural} addresses this issue by restricting the parameters to three levels $-1, 0$, and $+1$, thus eliminating multiplications in the forward pass of the network during prediction. We propose Generalized Ternary Connect (GTC), which allows an arbitrary number of levels while at the same time eliminating multiplications by restricting the parameters to integer powers of two. The primary contribution is that GTC learns the number of levels and their values for each layer, jointly with the weights of the network in an end-to-end fashion. Experiments on MNIST and CIFAR-10 show that GTC naturally converges to an `almost binary' network for deep classification networks (e.g. VGG-16) and deep variational auto-encoders, with negligible loss of classification accuracy and comparable visual quality of generated samples respectively. We demonstrate superior compression and similar accuracy of GTC in comparison to several state-of-the-art methods for neural network compression. We conclude with simulations showing the potential benefits of GTC in hardware.
Submitted 12 November, 2018;
originally announced November 2018.
-
Commissioning of the highly granular SiW-ECAL technological prototype
Authors:
S. Bilokin,
J. Bonis,
P. Cornebise,
A. Gallas,
A. Irles,
R. Pöschl,
F. Richard,
A. Thiebault,
D. Zerwas,
M. Anduze,
V. Balagura,
V. Boudry,
J-C. Brient,
E. Edy,
G. Fayolle,
M. Frotin,
F. Gastaldi,
A. Lobanov,
F. Magniette,
J. Nanni,
M. Rubio-Roy,
K. Shpak,
H. Videau,
D. Yu,
S. Callier
, et al. (18 additional authors not shown)
Abstract:
In this article we describe the commissioning and a first analysis of the beam test performance of a small prototype of a highly granular silicon tungsten calorimeter. The prototype features detector elements with a channel number similar to that envisaged for e.g. the ILD Detector of the International Linear Collider (ILC). The analysis demonstrates the capability of the detector to record signals as low as 0.5 MIP. Further, no loss of performance has been observed when operating the detector in a high magnetic field.
Submitted 4 April, 2019; v1 submitted 11 October, 2018;
originally announced October 2018.
-
Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning
Authors:
Zecheng He,
Aswin Raghavan,
Guangyuan Hu,
Sek Chai,
Ruby Lee
Abstract:
Controllers of security-critical cyber-physical systems, like the power grid, are a very important class of computer systems. Attacks against the control code of a power-grid system, especially zero-day attacks, can be catastrophic. Earlier detection of the anomalies can prevent further damage. However, detecting zero-day attacks is extremely challenging because they have no known code and have unknown behavior. Furthermore, if data collected from the controller is transferred to a server through networks for analysis and detection of anomalous behavior, this creates a very large attack surface and also delays detection.
In order to address this problem, we propose Reconstruction Error Distribution (RED) of Hardware Performance Counters (HPCs), and a data-driven defense system based on it. Specifically, we first train a temporal deep learning model, using only normal HPC readings from legitimate processes that run daily in these power-grid systems, to model the normal behavior of the power-grid controller. Then, we run this model using real-time data from commonly available HPCs. We use the proposed RED to enhance the temporal deep learning detection of anomalous behavior, by estimating distribution deviations from the normal behavior with an effective statistical test. Experimental results on a real power-grid controller show that we can detect anomalous behavior with high accuracy (>99.9%), nearly zero false positives and short (<360ms) latency.
Submitted 22 June, 2019; v1 submitted 18 June, 2018;
originally announced June 2018.
-
Electromagnon in Y-type hexaferrite BaSrCoZnFe$_{11}$AlO$_{22}$
Authors:
Jakub Vit,
Filip Kadlec,
Christelle Kadlec,
Fedir Borodavka,
Yi Sheng Chai,
Kun Zhai,
Young Sun,
Stanislav Kamba
Abstract:
We investigated static and dynamic magnetoelectric properties of single crystalline BaSrCoZnFe$_{11}$AlO$_{22}$ which is a room-temperature multiferroic with Y-type hexaferrite crystal structure. Below $300\,\rm K$, a purely electric-dipole-active electromagnon at $\approx 1.2\,\rm THz$ with the electric polarization oscillating along the hexagonal axis was observed by THz and Raman spectroscopies. We investigated the behavior of the electromagnon with applied DC magnetic field and linked its properties to static measurements of the magnetic structure. Our analytical calculations determined selection rules for electromagnons activated by the magnetostriction mechanism in various magnetic structures of Y-type hexaferrite. Comparison with our experiment supports that the electromagnon is indeed activated by the magnetostriction mechanism involving spin vibrations along the hexagonal axis.
Submitted 9 April, 2018;
originally announced April 2018.
-
Contrasting magnetoelectric behavior in multiferroic hexaferrites as understood by crystal symmetry analyses
Authors:
Y. S. Chai,
S. H. Chun,
J. Z. Cong,
Kee Hoon Kim
Abstract:
Magnetoelectric (ME) properties under rotating magnetic field H are comparatively investigated in two representative hexaferrites Y-type Ba0.5Sr1.5Zn2(Fe0.92Al0.08)12O22 and Z-type Ba0.52Sr2.48Co2Fe24O41, both of which have exhibited a similar transverse conical spin structure and giant ME coupling near room temperature. When the external H is rotated clockwise by 2pi, in-plane P vector is rotated clockwise by 2pi in the Y-type hexaferrite and counterclockwise by 4pi in the Z-type hexaferrite. A symmetry-based analysis reveals that the faster and opposite rotation of P vector in the Z-type hexaferrite is associated with the existence of a mirror plane perpendicular to c-axis. Moreover, such a peculiar crystal symmetry also results in contrasting microscopic origins for the spin-driven ferroelectricity; only the inverse DM interaction is responsible for the Y-type hexaferrite while additional p-d hybridization becomes more important in the Z-type hexaferrite. This work demonstrates the importance of the crystal symmetry in the determination of ME properties in the hexaferrites and provides a fundamental framework for understanding and applying the giant ME coupling in various ferrites with hexagonal crystal structure.
Submitted 16 December, 2017;
originally announced December 2017.
-
BitNet: Bit-Regularized Deep Neural Networks
Authors:
Aswin Raghavan,
Mohamed Amer,
Sek Chai,
Graham Taylor
Abstract:
We present a novel optimization strategy for training neural networks which we call "BitNet". The parameters of neural networks are usually unconstrained and have a dynamic range dispersed over all real values. Our key idea is to limit the expressive power of the network by dynamically controlling the range and set of values that the parameters can take. We formulate this idea using a novel end-to-end approach that circumvents the discrete parameter space by optimizing a relaxed continuous and differentiable upper bound of the typical classification loss function. The approach can be interpreted as a regularization inspired by the Minimum Description Length (MDL) principle. For each layer of the network, our approach optimizes real-valued translation and scaling factors and arbitrary precision integer-valued parameters (weights). We empirically compare BitNet to an equivalent unregularized model on the MNIST and CIFAR-10 datasets. We show that BitNet converges faster to a superior quality solution. Additionally, the resulting model has significant savings in memory due to the use of integer-valued parameters.
Submitted 16 November, 2018; v1 submitted 16 August, 2017;
originally announced August 2017.
-
GPU Activity Prediction using Representation Learning
Authors:
Aswin Raghavan,
Mohamed Amer,
Timothy Shields,
David Zhang,
Sek Chai
Abstract:
GPU activity prediction is an important and complex problem. This is due to the high level of contention among thousands of parallel threads. This problem was mostly addressed using heuristics. We propose a representation learning approach to address this problem. We model any performance metric as a temporal function of the executed instructions with the intuition that the flow of instructions can be identified as distinct activities of the code. Our experiments show high accuracy and non-trivial predictive power of representation learning on a benchmark.
Submitted 27 March, 2017;
originally announced March 2017.
-
Low Precision Neural Networks using Subband Decomposition
Authors:
Sek Chai,
Aswin Raghavan,
David Zhang,
Mohamed Amer,
Tim Shields
Abstract:
Large-scale deep neural networks (DNN) have been successfully used in a number of tasks from image recognition to natural language processing. They are trained using large training sets on large models, making them computationally and memory intensive. As such, there is much interest in research development for faster training and test time. In this paper, we present a unique approach using lower precision weights for a more efficient and faster training phase. We separate imagery into different frequency bands (e.g. with different information content) such that the neural net can better learn using fewer bits. We present this approach as a complement to existing methods such as pruning network connections and encoding learning weights. We show results where this approach supports more stable learning with 2-4X reduction in precision with 17X reduction in DNN parameters.
Submitted 24 March, 2017;
originally announced March 2017.
-
Observation of Toroidal Alfven Eigenmodes during Minor Disruptions in Ohmic Plasmas
Authors:
Yangqing Liu,
Yi Tan,
Zhe Gao,
Yuhong Xu,
Youjun Hu,
Song Chai,
Yanzheng Jiang,
Rui Ke,
Heng Zhong,
Wenhao Wang
Abstract:
Toroidal Alfven eigenmodes (TAEs) excited in purely ohmically heated plasmas without any auxiliary heating have been identified for the first time in the SUNIST spherical tokamak. The TAE modes are observed during minor disruptions and have a frequency range of 150-500 kHz. The mode structure analysis indicates the existence of both m/n=-3/-1 and -4/-1 harmonics, propagating in the electron diamagnetic direction in the laboratory frame of reference. These TAEs appear simultaneously with the generation of runaway electrons in the current quench phase, accompanying the density sweeping during the minor disruption. Possible driving mechanisms and potential applications of these TAEs are discussed.
Submitted 13 December, 2016;
originally announced December 2016.
-
Resonant transfer of large momenta from finite duration pulse sequences
Authors:
Julia Fekete,
Shijie Chai,
Simon A. Gardiner,
Mikkel F. Andersen
Abstract:
We experimentally investigate the atom optics kicked particle at quantum resonance using finite duration kicks. Even though the underlying process is quantum interference it can be well described by an $ε$-pseudoclassical model. The $ε$-pseudoclassical model agrees well with our experiments for a wide range of parameters. We investigate the parameters yielding maximal momentum transfer to the atoms and find that this occurs in the regime where neither the short pulse approximation nor the Bragg condition is valid. Nonetheless, the momentum transferred to the atoms can be predicted using a simple scaling law, which provides a powerful tool for choosing optimal experimental parameters. We demonstrate this in a measurement of the Talbot time (from which $h/M$ can be deduced), in which we coherently split atomic wave-functions into superpositions of momentum states that differ by 200 photon recoils. Our work may provide a convenient way to implement large momentum difference beam splitters in atom interferometers.
Submitted 1 February, 2017; v1 submitted 26 September, 2016;
originally announced September 2016.
-
Electromagnon in the Z-type hexaferrite $({\rm Ba}_{x}{\rm Sr}_{1-x})_3\rm Co_2Fe_{24}O_{41}$
Authors:
Filip Kadlec,
Christelle Kadlec,
Jakub Vit,
Fedir Borodavka,
Martin Kempa,
Jan Prokleska,
Josef Bursik,
Robert Uhrecky,
Stephane Rols,
Yi Sheng Chai,
Kun Zhai,
Young Sun,
Jan Drahokoupil,
Veronica Goian,
Stanislav Kamba
Abstract:
We studied experimentally the high-temperature magnetoelectric $({\rm Ba}_{x}{\rm Sr}_{1-x})_3\rm Co_2Fe_{24}O_{41}$ prepared as ceramics (x = 0, 0.2) and a single crystal (x = 0.5) using inelastic neutron scattering, THz time-domain, Raman and far-infrared spectroscopies. The spectra, measured with varying temperature and magnetic field, reveal rich information about the collective spin and lattice excitations. In the ceramics, we observed an infrared-active magnon which is absent in $E^ω\perp z$ polarized THz spectra of the crystal, and we assume that it is an electromagnon active in $E^ω \| z$ polarized spectra. On heating from 7 to 250 K, the frequency of this electromagnon drops from 36 to 25 cm$^{-1}$ and its damping gradually increases, so it becomes overdamped at room temperature. Applying external magnetic field has a similar effect on the damping and frequency of the electromagnon, and the mode is no longer observable in the THz spectra above 2 T, as the transverse-conical magnetic structure transforms into a collinear one. Raman spectra reveal another spin excitation with a slightly different frequency and much higher damping. Upon applying magnetic field higher than 3 T, in the low-frequency part of the THz spectra, a narrow excitation appears whose frequency linearly increases with magnetic field. We interpret this feature as the ferromagnetic resonance.
Submitted 20 July, 2016;
originally announced July 2016.
-
Intelligent Low-level RF System by Non-destructive Beam Monitoring Device for Cyclotrons
Authors:
M. S. Sharifi Asadi Malafeh,
M. Ghergherehchi,
H. Afarideh,
J. S. Chai
Abstract:
The project of a 10 MeV PET cyclotron accelerator for medical diagnosis and treatment was started at Amirkabir University of Technology in 2012. The low-level RF system of the cyclotron accelerator is designed to stabilize the acceleration voltage and control the resonance frequency of the cavity. In this work, an Intelligent Low Level Radio Frequency circuit (ILLRF), suitable for most AVF cyclotron accelerators, was designed using the beam monitoring device and a narrow-band tunable band-pass filter. In this design, RF phase detection does not require signal processing by a microcontroller.
Submitted 20 April, 2015;
originally announced April 2015.
-
Magnetic domain-wall motion twisted by nanoscale probe-induced spin transfer
Authors:
J. Wang,
L. S. Xie,
C. S. Wang,
H. Z. Zhang,
L. Shu,
J. Bai,
Y. S. Chai,
X. Zhao,
J. C. Nie,
C. B. Cao,
C. Z. Gu,
C. M. Xiong,
Y. Sun,
J. Shi,
S. Salahuddin,
K. Xia,
C. W. Nan,
J. X. Zhang
Abstract:
A method for deterministic control of the magnetic order parameter using an electrical stimulus is highly desired for the new generation of spintronic and magnetoelectronic devices. Much effort has been focused on magnetic domain-wall motion manipulated by a successive injection of spin-polarized current into a magnetic nanostructure. However, an integrant high-threshold current density of $10^7$~$10^8$ A/cm$^2$ inhibits the integration of those nanostructures with low-energy-cost technology. In addition, a precise determination of the location of domain walls at nanoscale seems difficult in artificially manufactured nanostructures. Here we report an approach to manipulate a single magnetic domain wall with a perpendicular anisotropy in a manganite/dielectric/metal capacitor using a probe-induced spin displacement. A spin angular momentum transfer torque occurs in the strongly correlated manganite film during the spin injection into the capacitor from the nanoscale magnetized tip with an ultralow voltage of 0.1 V, where the threshold spin-polarized current density is ~$10^4$ A/cm$^2$ at the tip/manganite interface. The probe-voltage-controlled domain wall motion in the capacitor demonstrates a critical framework for the fundamental understanding of the manipulation of the nano-magnet systems with low energy consumption.
Submitted 10 July, 2014;
originally announced July 2014.