-
Masked-attention Mask Transformer for Universal Image Segmentation
Authors:
Bowen Cheng,
Ishan Misra,
Alexander G. Schwing,
Alexander Kirillov,
Rohit Girdhar
Abstract:
Image segmentation is about grouping pixels with different semantics, e.g., category or instance membership, where each choice of semantics defines a task. While only the semantics of each task differ, current research focuses on designing specialized architectures for each task. We present Masked-attention Mask Transformer (Mask2Former), a new architecture capable of addressing any image segmentation task (panoptic, instance or semantic). Its key components include masked attention, which extracts localized features by constraining cross-attention within predicted mask regions. In addition to reducing the research effort by at least three times, it outperforms the best specialized architectures by a significant margin on four popular datasets. Most notably, Mask2Former sets a new state-of-the-art for panoptic segmentation (57.8 PQ on COCO), instance segmentation (50.1 AP on COCO) and semantic segmentation (57.7 mIoU on ADE20K).
Submitted 15 June, 2022; v1 submitted 2 December, 2021;
originally announced December 2021.
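The masked attention described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the list-of-lists layout, and the fallback for an empty mask are assumptions made for illustration. Each query attends only to key positions that fall inside its predicted mask.

```python
import math


def masked_cross_attention(queries, keys, values, mask):
    """Toy masked cross-attention.

    queries: N x d, keys and values: M x d (lists of lists of floats).
    mask: N x M booleans, True where the key position lies inside the
    query's predicted mask region.
    """
    outputs = []
    for q, allowed in zip(queries, mask):
        # Assumption: a query with an empty mask attends everywhere,
        # which avoids a softmax over zero positions.
        if not any(allowed):
            allowed = [True] * len(keys)
        scale = math.sqrt(len(q))
        scores = [
            sum(a * b for a, b in zip(q, k)) / scale if ok else float("-inf")
            for k, ok in zip(keys, allowed)
        ]
        top = max(scores)  # finite, since at least one position is allowed
        weights = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
        total = sum(weights)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))
        ])
    return outputs
```

With a mask that allows only the second key, the output for that query is exactly the second value vector, however well the first key matches.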
-
High-pressure phase behaviors of titanium dioxide revealed by a $Δ$-learning potential
Authors:
Jacob G. Lee,
Chris J. Pickard,
Bingqing Cheng
Abstract:
Titanium dioxide has been extensively studied in the rutile or anatase phases, while its high-pressure phases are less well understood, even though many are thought to have interesting optical, mechanical and electrochemical properties. First-principles methods such as density functional theory (DFT) are often used to compute the enthalpies of TiO$_2$ phases at 0~K, but they are expensive and thus impractical for long time-scale and large system-size simulations at finite temperatures. On the other hand, cheap empirical potentials fail to capture the relative stabilities of the various polymorphs. To model the thermodynamic behaviors of ambient and high-pressure phases of TiO$_2$, we design an empirical model as a baseline, and then train a machine learning potential based on the difference between the DFT data and the empirical model. This so-called $Δ$-learning potential contains long-range electrostatic interactions, and predicts 0~K enthalpies of stable TiO$_2$ phases that are in good agreement with DFT. We construct a pressure-temperature phase diagram of TiO$_2$ in the range $0<P<70$~GPa and $100<T<1500$~K. We then simulate dynamic phase transition processes by compressing anatase at different temperatures. At 300~K, we observe predominantly anatase-to-baddeleyite transformation at about 20~GPa, via a martensitic two-step mechanism with highly ordered and collective atomic motion. At 2000~K, anatase can transform into cotunnite around 45-55~GPa in a thermally-activated and probabilistic manner, accompanied by diffusive movement of oxygen atoms. The pressures computed for these transitions show good agreement with experiments. Our results shed light on how to synthesize and stabilize high-pressure TiO$_2$ phases, and our method is generally applicable to other functional materials with multiple polymorphs.
Submitted 25 November, 2021;
originally announced November 2021.
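The $Δ$-learning idea in the abstract above (fit a model to the difference between reference data and a cheap baseline, then add the two back together) can be sketched on toy data. Everything here is hypothetical: the one-dimensional inputs, the function names, and the use of a straight-line fit as a stand-in for the machine learning potential are assumptions, not the authors' method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_delta_model(xs, reference, baseline):
    """Learn a correction on (reference - baseline) and return the combined model.

    xs: training inputs; reference: expensive reference values (standing in
    for DFT data); baseline: cheap callable (standing in for the empirical model).
    """
    residuals = [ref - baseline(x) for x, ref in zip(xs, reference)]
    # Assumption: a straight line replaces the ML potential for brevity.
    slope, intercept = fit_line(xs, residuals)
    return lambda x: baseline(x) + slope * x + intercept
```

The learned part only has to capture what the baseline misses, which is typically a smoother and smaller quantity than the full reference value.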
-
Development of a GPU-accelerated Monte Carlo dose calculation module for nuclear medicine, ARCHER-NM: Demonstration for a PET/CT imaging procedure
Authors:
Zhao Peng,
Yu Lu,
Yao Xu,
Yongzhe Li,
Bo Cheng,
Ming Ni,
Zhi Chen,
Xi Pei,
Qiang Xie,
Shicun Wang,
X. George Xu
Abstract:
This paper describes the development and validation of a Monte Carlo (MC) dose computing module dedicated to organ dose calculations of patients undergoing nuclear medicine (NM) internal radiation exposures involving 18F-FDG PET/CT examination. This new module extends the more-than-10-years-long ARCHER project that developed a GPU-accelerated MC dose engine by adding dedicated NM source-definition features. To validate the code, we compared dose distributions from the 0.511-MeV point photon source calculated for a water phantom as well as a patient PET/CT phantom against a well-tested MC code, GATE. The water-phantom results show excellent agreement, suggesting that the radiation physics module in the new NM code is adequate. To demonstrate the clinical utility and advantage of ARCHER-NM, one set of PET/CT data for an adult male NM patient is calculated using the new code. Radiosensitive organs in the CT dataset are segmented using a CNN-based tool called DeepViewer. The PET image intensity maps are converted to radioactivity distributions to allow for MC radiation transport dose calculations at the voxel level. The dose rate maps and corresponding statistical uncertainties are calculated for the duration of PET image acquisition. The dose rate results for the 18F-FDG PET imaging patient show that ARCHER-NM agrees very well with GATE, with differences of 0.58% to 4.11%. Most impressively, ARCHER-NM obtains such results in less than 0.5 minutes, while GATE takes as much as 376 minutes. This is the first study presenting GPU-accelerated patient-specific MC internal radiation dose rate calculations for clinically realistic 18F-FDG PET/CT imaging cases involving auto-segmentation of whole-body PET/CT images. This study suggests that modern computing tools -- ARCHER-NM and DeepViewer -- are accurate and fast enough for routine internal dosimetry in NM clinics.
Submitted 23 November, 2021;
originally announced November 2021.
-
Effects of Design and Hydrodynamic Parameters on Optimized Swimming for Simulated, Fish-inspired Robots
Authors:
Donghao Li,
Hankun Deng,
Yagiz E. Bayiz,
Bo Cheng
Abstract:
In this work we developed a mathematical model and a simulation platform for a fish-inspired robotic template, namely Magnetic, Modular, Undulatory Robotics ($μ$Bots). Through this platform, we systematically explored the effects of design and fluid parameters on the swimming performance via reinforcement learning. The mathematical model was composed of two interacting subsystems, the robot dynamics and the hydrodynamics, and the hydrodynamic model consisted of reactive components (added-mass and pressure forces) and resistive components (drag and friction forces), which were then nondimensionalized for deriving key "control parameters" of robot-fluid interaction. The $μ$Bot was actuated via magnetic actuators controlled with harmonic voltage signals, which were optimized via EM-based Policy Hyper Parameter Exploration (EPHE) to maximize swimming speed. By varying the control parameters, a total of 36 cases with different robot template variations (Number of Actuation (NoA) and stiffness) and hydrodynamic parameters were simulated and optimized via EPHE. Results showed that the wavelength of optimized gaits (i.e., traveling wave along body) was independent of template variations and hydrodynamic parameters. Higher NoA yielded higher speed, however with diminishing gain and lower speed per body length. Body and caudal-fin gait dynamics were dominated by the interaction among fluid added-mass, spring, and actuation torque, with negligible contribution from fluid resistive drag. In contrast, thrust generation was dominated by pressure force acting on the caudal fin, as steady swimming resulted from a balance between resistive force and pressure force, with minor contributions from added-mass and body drag forces. Therefore, added-mass force only indirectly affected the thrust generation and swimming speed via the caudal fin dynamics.
Submitted 10 November, 2021;
originally announced November 2021.
-
Optimal Inverted Landing in a Small Aerial Robot with Varied Approach Velocities and Landing Gear Designs
Authors:
Bryan Habas,
Bader AlAttar,
Brian Davis,
Jack W. Langelaan,
Bo Cheng
Abstract:
Inverted landing is a challenging feat to perform in aerial robots, especially without external positioning. However, it is routinely performed by biological fliers such as bees, flies, and bats. Our previous observations of landing behaviors in flies suggest an open-loop causal relationship between their putative visual cues and the kinematics of the aerial maneuvers executed. For example, the degree of rotational maneuver (the amount of body inversion prior to touchdown) and the amount of leg-assisted body swing both depend on the flies' initial body states while approaching the ceiling. In this work, inspired by the inverted landing behavior of flies, we used a physics-based simulation with experimental validation to systematically investigate how optimized inverted landing maneuvers depend on the initial approach velocities with varied magnitude and direction. This was done by analyzing the putative visual cues (that can be derived from onboard measurements) during optimal maneuvering trajectories. We identified a three-dimensional policy region, from which a mapping to a global inverted landing policy can be developed without the use of external positioning data. Through simulation, we also investigated the effects of an array of landing gear designs on the optimized landing performance and identified their advantages and disadvantages. The above results have been partially validated using limited experimental testing and will continue to inform and guide our future experiments, for example by applying the calculated global policy.
Submitted 3 March, 2022; v1 submitted 5 November, 2021;
originally announced November 2021.
-
Near Resonance Approximation of Rotating Navier-Stokes Equations
Authors:
Bin Cheng,
Zisis N. Sakellaris
Abstract:
We formalise the concept of near resonance for the rotating Navier-Stokes equations, based on which we propose a novel way to approximate the original PDE. The spatial domain is a three-dimensional flat torus of arbitrary aspect ratios. We prove that the family of proposed PDEs is globally well-posed for any rotation rate and initial datum of any size in any $H^s$ space with $s\ge0$. Such approximations retain many more 3-mode interactions, and are thus more accurate, than the conventional exact resonance approach. Our approach is free from any limiting argument that requires physical parameters to tend to zero or infinity, and is free from any small divisor argument (so estimates depend smoothly on the torus' aspect ratios). The key estimate hinges on counting integer solutions of Diophantine inequalities rather than Diophantine equations. Using a range of novel ideas, we handle rigorously and optimally the challenges arising from the non-trivial irrational functions in these inequalities. The main results and ingredients of the proofs can form part of the mathematical foundation of a non-asymptotic approach to nonlinear oscillatory dynamics in real-world applications.
Submitted 10 October, 2021;
originally announced October 2021.
-
Using Comics to Introduce and Reinforce Programming Concepts in CS1
Authors:
Sangho Suh,
Celine Latulipe,
Ken Jen Lee,
Bernadette Cheng,
Edith Law
Abstract:
Recent work investigated the potential of comics to support the teaching and learning of programming concepts and suggested several ways $coding$ $strips$, a form of comic strip with its corresponding code, can be used. Building on this work, we tested the recommended use cases of $coding$ $strips$ in an undergraduate introductory computer science course at a large comprehensive university. At the end of the course, we surveyed students to assess their experience and found they benefited in various ways. Our work contributes a demonstration of the various ways comics can be used in introductory CS courses and an initial understanding of the benefits and challenges of using comics in computing education, gleaned from an analysis of students' survey responses and code submissions.
Submitted 27 September, 2021; v1 submitted 27 September, 2021;
originally announced September 2021.
-
Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations
Authors:
Wei Chen,
Yeyun Gong,
Can Xu,
Huang Hu,
Bolun Yao,
Zhongyu Wei,
Zhihao Fan,
Xiaowu Hu,
Bartuer Zhou,
Biao Cheng,
Daxin Jiang,
Nan Duan
Abstract:
We study the problem of coarse-grained response selection in retrieval-based dialogue systems. The problem is as important as fine-grained response selection, but is less explored in existing literature. In this paper, we propose a Contextual Fine-to-Coarse (CFC) distilled model for coarse-grained response selection in open-domain conversations. In our CFC model, dense representations of query, candidate response and corresponding context are learned based on the multi-tower architecture, and more expressive knowledge learned from the one-tower architecture (fine-grained) is distilled into the multi-tower architecture (coarse-grained) to enhance the performance of the retriever. To evaluate the performance of our proposed model, we construct two new datasets based on the Reddit comments dump and Twitter corpus. Extensive experimental results on the two datasets show that the proposed methods achieve a significant improvement over all evaluation metrics compared with traditional baseline methods.
Submitted 27 April, 2022; v1 submitted 24 September, 2021;
originally announced September 2021.
-
FCM: A Fine-grained Comparison Model for Multi-turn Dialogue Reasoning
Authors:
Xu Wang,
Hainan Zhang,
Shuai Zhao,
Yanyan Zou,
Hongshen Chen,
Zhuoye Ding,
Bo Cheng,
Yanyan Lan
Abstract:
Despite the success of neural dialogue systems in achieving high performance on the leader-board, they cannot meet users' requirements in practice, due to their poor reasoning skills. The underlying reason is that most neural dialogue models only capture the syntactic and semantic information, but fail to model the logical consistency between the dialogue history and the generated response. Recently, a new multi-turn dialogue reasoning task has been proposed to facilitate dialogue reasoning research. However, this task is challenging, because there are only slight differences between the illogical response and the dialogue history. How to effectively solve this challenge is still worth exploring. This paper proposes a Fine-grained Comparison Model (FCM) to tackle this problem. Inspired by human behavior in reading comprehension, a comparison mechanism is proposed to focus on the fine-grained differences in the representation of each response candidate. Specifically, each candidate representation is compared with the whole history to obtain a history consistency representation. Furthermore, the consistency signals between each candidate and the speaker's own history are considered to drive a model to prefer a candidate that is logically consistent with the speaker's history logic. Finally, the above consistency representations are employed to output a ranking list of the candidate responses for multi-turn dialogue reasoning. Experimental results on two public dialogue datasets show that our method obtains higher ranking scores than the baseline models.
Submitted 23 September, 2021; v1 submitted 22 September, 2021;
originally announced September 2021.
-
Scalable massively parallel computing using continuous-time data representation in nanoscale crossbar array
Authors:
Cong Wang,
Shi-Jun Liang,
Chen-Yu Wang,
Zai-Zheng Yang,
Yingmeng Ge,
Chen Pan,
Xi Shen,
Wei Wei,
Yichen Zhao,
Zaichen Zhang,
Bin Cheng,
Chuan Zhang,
Feng Miao
Abstract:
The growth of connected intelligent devices in the Internet of Things has created a pressing need for real-time processing and understanding of large volumes of analogue data. The difficulty in boosting the computing speed renders digital computing unable to meet the demand for processing analogue information that is intrinsically continuous in magnitude and time. By utilizing a continuous data representation in a nanoscale crossbar array, parallel computing can be implemented for the direct processing of analogue information in real time. Here, we propose a scalable massively parallel computing scheme by exploiting a continuous-time data representation and frequency multiplexing in a nanoscale crossbar array. This computing scheme enables the parallel reading of stored data and the one-shot operation of matrix-matrix multiplications in the crossbar array. Furthermore, we achieve the one-shot recognition of 16 letter images based on two physically interconnected crossbar arrays and demonstrate that the processing and modulation of analogue information can be simultaneously performed in a memristive crossbar array.
Submitted 16 September, 2021;
originally announced September 2021.
-
Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios
Authors:
Jingliang Duan,
Yangang Ren,
Fawang Zhang,
Yang Guan,
Dongjie Yu,
Shengbo Eben Li,
Bo Cheng,
Lin Zhao
Abstract:
In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in higher policy performance and generality. We first develop an encoding distributional policy iteration (DPI) framework by embedding a permutation invariant module, which employs a feature neural network (NN) to encode the indicators of each vehicle, in the distributional RL framework. The proposed DPI framework is proved to exhibit important properties in terms of convergence and global optimality. Next, based on the developed encoding DPI framework, we propose the E-DSAC algorithm by adding the gradient-based update rule of the feature NN to the policy evaluation process of the DSAC algorithm. Then, the multi-lane driving task and the corresponding reward function are designed to verify the effectiveness of the proposed algorithm. Results show that the policy learned by E-DSAC can realize efficient, smooth, and relatively safe autonomous driving in the designed scenario, and that the final policy performance learned by E-DSAC is about three times that of DSAC. Furthermore, its effectiveness has also been verified in real vehicle experiments.
Submitted 12 September, 2021;
originally announced September 2021.
-
Exploring Task Difficulty for Few-Shot Relation Extraction
Authors:
Jiale Han,
Bo Cheng,
Wei Lu
Abstract:
Few-shot relation extraction (FSRE) focuses on recognizing novel relations by learning with merely a handful of annotated instances. Meta-learning has been widely adopted for such a task, which trains on randomly generated few-shot tasks to learn generic data representations. Despite impressive results achieved, existing models still perform suboptimally when handling hard FSRE tasks, where the relations are fine-grained and similar to each other. We argue this is largely because existing models do not distinguish hard tasks from easy ones in the learning process. In this paper, we introduce a novel approach based on contrastive learning that learns better representations by exploiting relation label information. We further design a method that allows the model to adaptively learn how to focus on hard tasks. Experiments on two standard datasets demonstrate the effectiveness of our method.
Submitted 24 October, 2021; v1 submitted 12 September, 2021;
originally announced September 2021.
-
Security and privacy for 6G: A survey on prospective technologies and challenges
Authors:
Van-Linh Nguyen,
Po-Ching Lin,
Bo-Chao Cheng,
Ren-Hung Hwang,
Ying-Dar Lin
Abstract:
Sixth-generation (6G) mobile networks will have to cope with diverse threats on a space-air-ground integrated network environment, novel technologies, and an accessible user information explosion. However, for now, security and privacy issues for 6G remain largely in concept. This survey provides a systematic overview of security and privacy issues based on prospective technologies for 6G in the physical, connection, and service layers, as well as through lessons learned from the failures of existing security architectures and state-of-the-art defenses. Two key lessons learned are as follows. First, other than inheriting vulnerabilities from the previous generations, 6G has new threat vectors from new radio technologies, such as the exposed location of radio stripes in ultra-massive MIMO systems at Terahertz bands and attacks against pervasive intelligence. Second, physical layer protection, deep network slicing, quantum-safe communications, artificial intelligence (AI) security, platform-agnostic security, real-time adaptive security, and novel data protection mechanisms such as distributed ledgers and differential privacy are the top promising techniques to mitigate the attack magnitude and personal data breaches substantially.
Submitted 31 August, 2021; v1 submitted 26 August, 2021;
originally announced August 2021.
-
Room-Temperature Anisotropic Plasma Mirror and Polarization-Controlled Optical Switch Based on Type-II Weyl Semimetal WP2
Authors:
Kaixuan Zhang,
Yongping Du,
Zeming Qi,
Bin Cheng,
Xiaodong Fan,
Laiming Wei,
Lin Li,
Dongli Wang,
Guolin Yu,
Shuhong Hu,
Changhong Sun,
Zhiming Huang,
Junhao Chu,
Xiangang Wan,
Changgan Zeng
Abstract:
Anisotropy in electronic structures may ignite intriguing anisotropic optical responses, as has been well demonstrated in various systems including superconductors, semiconductors, and even topological Weyl semimetals. Meanwhile, it is well established in metal optics that the metal reflectance declines from one to zero when the photon frequency is above the plasma frequency ωp , behaving as a plasma mirror. However, the exploration of anisotropic plasma mirrors and corresponding applications remains elusive, especially at room temperature. Here, we discover a pronounced anisotropic plasma reflectance edge in the type-II Weyl semimetal WP2, with an anisotropy ratio of ωp up to 1.5. Such anisotropic plasma mirror behavior and its robustness against temperature promise optical device applications over a wide temperature range. For example, the high sensitivity of polarization-resolved plasma reflectance edge renders WP2 an inherent polarization detector. We further achieve a room-temperature WP2-based optical switch, effectively controlled by simply tuning the light polarization. These findings extend the frontiers of metal optics as a discipline and promise the design of multifunctional devices combining both topological and optical features.
Submitted 10 August, 2021; v1 submitted 7 August, 2021;
originally announced August 2021.
-
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Authors:
Bowen Cheng,
Alexander G. Schwing,
Alexander Kirillov
Abstract:
Modern approaches typically formulate semantic segmentation as a per-pixel classification task, while instance-level segmentation is handled with an alternative mask classification. Our key insight: mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks in a unified manner using the exact same model, loss, and training procedure. Following this observation, we propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each associated with a single global class label prediction. Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results. In particular, we observe that MaskFormer outperforms per-pixel classification baselines when the number of classes is large. Our mask classification-based method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
Submitted 31 October, 2021; v1 submitted 13 July, 2021;
originally announced July 2021.
-
Fixed-Dimensional and Permutation Invariant State Representation of Autonomous Driving
Authors:
Jingliang Duan,
Dongjie Yu,
Shengbo Eben Li,
Wenxuan Wang,
Yangang Ren,
Ziyu Lin,
Bo Cheng
Abstract:
In this paper, we propose a new state representation method, called encoding sum and concatenation (ESC), for the state representation of decision-making in autonomous driving. Unlike existing state representation methods, ESC is applicable to a variable number of surrounding vehicles and eliminates the need for manually pre-designed sorting rules, leading to higher representation ability and generality. The proposed ESC method introduces a representation neural network (NN) to encode each surrounding vehicle into an encoding vector, and then adds these vectors to obtain the representation vector of the set of surrounding vehicles. By concatenating the set representation with other variables, such as indicators of the ego vehicle and road, we realize the fixed-dimensional and permutation invariant state representation. This paper has further proved that the proposed ESC method can realize the injective representation if the output dimension of the representation NN is greater than the number of variables of all surrounding vehicles. This means that by taking the ESC representation as policy inputs, we can find the nearly optimal representation NN and policy NN by simultaneously optimizing them using gradient-based updating. Experiments demonstrate that compared with the fixed-permutation representation method, the proposed method improves the representation ability of the surrounding vehicles, and the corresponding approximation error is reduced by 62.2%.
Submitted 4 March, 2022; v1 submitted 24 May, 2021;
originally announced May 2021.
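The encode, sum, and concatenate steps described in the ESC abstract above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy encoder `make_encoder` (one random linear layer with tanh) stands in for the representation NN, and the feature layout of each vehicle and of the ego/road variables is hypothetical. It is not the authors' implementation; it only shows why summing per-vehicle encodings gives a fixed-dimensional, order-independent state.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Hypothetical stand-in for the representation NN: linear layer + tanh.

    The weights are random and untrained; only the structure matters here.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(vehicle):
        return [math.tanh(sum(w * x for w, x in zip(row, vehicle)))
                for row in weights]

    return encode


def esc_state(vehicles, ego_and_road, encode, out_dim):
    """Encode each vehicle, sum the encodings, then concatenate other variables."""
    set_repr = [0.0] * out_dim
    for vehicle in vehicles:
        for i, value in enumerate(encode(vehicle)):
            set_repr[i] += value
    return set_repr + list(ego_and_road)


# Assumed layout: each vehicle is [relative position, lateral offset, speed].
encode = make_encoder(in_dim=3, out_dim=10)
ego_and_road = [12.0, 0.1, 3.5]
vehicles = [[5.0, 1.0, 10.0], [-8.0, 0.0, 9.0], [20.0, -1.0, 11.0]]

state_a = esc_state(vehicles, ego_and_road, encode, out_dim=10)
state_b = esc_state(list(reversed(vehicles)), ego_and_road, encode, out_dim=10)
state_c = esc_state(vehicles[:2], ego_and_road, encode, out_dim=10)
```

Here `state_a` and `state_b` agree up to floating-point rounding even though the vehicles are listed in a different order, and `state_c` has the same length although one vehicle is missing. The output dimension of 10 is chosen to exceed the 9 variables of the three vehicles, in the spirit of the injectivity condition stated in the abstract; the sketch does not verify injectivity.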
-
DPN-SENet: A self-attention mechanism neural network for detection and diagnosis of COVID-19 from chest x-ray images
Authors:
Bo Cheng,
Ruhui Xue,
Hang Yang,
Laili Zhu,
Wei Xiang
Abstract:
Background and Objective: The new type of coronavirus is also called COVID-19. It began to spread at the end of 2019 and has now spread across the world. As of October 2020, it has infected around 37 million people and claimed about 1 million lives. We propose a deep learning model that can help radiologists and clinicians use chest X-rays to diagnose COVID-19 cases and show the diagnostic features of pneumonia. Methods: The approach in this study is: 1) We propose a data enhancement method to increase the diversity of the data set, thereby improving the generalization performance of the model. 2) Our deep convolution neural network model DPN-SE adds a self-attention mechanism to the DPN network. The addition of a self-attention mechanism has greatly improved the performance of the network. 3) We use the Lime interpretable library to mark the feature regions on the X-ray medical image that help doctors more quickly diagnose COVID-19 in people. Results: Under the same network model, the data with and without data enhancement are put into the model for training respectively. Comparing the two experimental results, among the 10 network models with different structures, 7 network models improved after using data enhancement, with an average improvement of 1% in recognition accuracy. We propose that the accuracy and recall rates of the DPN-SE network are 93% and 98% of cases (COVID vs. pneumonia bacteria vs. viral pneumonia vs. normal). Compared with the original DPN, the respective accuracy is improved by 2%. Conclusion: The data augmentation method we used has achieved effective results on a small data set, showing that a reasonable data augmentation method can improve the recognition accuracy without changing the sample size and model structure. Overall, the proposed method and model can effectively become a very useful tool for clinical radiologists.
Submitted 20 May, 2021;
originally announced May 2021.
-
Structure and magnetic properties of the $S=3/2$ zigzag spin chain antiferromagnet BaCoTe$_2$O$_7$
Authors:
Lisi Li,
Xunwu Hu,
Zengjia Liu,
Jia Yu,
Benyuan Cheng,
Sihao Deng,
Lunhua He,
Kun Cao,
Dao-Xin Yao,
Meng Wang
Abstract:
We report an investigation on structure and magnetic properties of the $S=3/2$ zigzag spin chain compound BaCoTe$_2$O$_7$. Neutron diffraction measurements reveal BaCoTe$_2$O$_7$ crystallizes in the noncentrosymmetric space group $Ama2$ with a canted $\uparrow\uparrow\downarrow\downarrow$ spin structure along the quasi-one-dimensional zigzag chain and a moment size of $1.89(2)μ_B$ at 2 K. Magnetic susceptibility and specific heat measurements yield an antiferromagnetic phase transition at $T_N=6.2$ K. A negative Curie-Weiss temperature $Θ_{CW}=-74.7(2)$ K and an empirical frustration parameter of $f=|Θ_\text{CW}|/T_\text{N}\approx12$ is obtained from fitting the magnetic susceptibility, indicating antiferromagnetic interactions and strong magnetic frustration. By employing ultraviolet-visible absorption spectroscopy and first principles calculations, an indirect band gap of 2.68(2) eV is determined. We propose that the canted zigzag spin chain of BaCoTe$_2$O$_7$ may produce a change of the polarization via exchange striction mechanism.
Submitted 5 July, 2021; v1 submitted 20 May, 2021;
originally announced May 2021.
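As a quick arithmetic check of the frustration parameter quoted in the abstract above, using only the values reported there ($Θ_{CW}=-74.7$ K and $T_N=6.2$ K):

$$
f = \frac{|Θ_\text{CW}|}{T_\text{N}} = \frac{74.7\ \text{K}}{6.2\ \text{K}} \approx 12.05 \approx 12
$$

This agrees with the stated $f\approx12$.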
-
Sand Creep Motion in Slow Spin-up Experiment: An Analogue of Regolith Migration on Asteroids
Authors:
Chenyang Huang,
Yang Yu,
Bin Cheng,
Kaiming Zhang,
Dong Qiao,
Hexi Baoyin
Abstract:
We studied the creep motion of granular materials in a gradient potential field that is created using a slow spin-up experiment device. Natural sand confined in the acrylic box is spun up by a controlled turntable and the surface flows are captured using video-based measurements. Various spin-up accelerations were considered to understand the responses of creep motion on different accelerating paths. Convergent behaviors in the morphological change of sand surface were observed in the final steady state. To quantify the quasi-static spin-up process, we examined the net flux and the surface slope as a function of the spin rate and offset from the rotation axis. Evolution of sand creep motion demonstrated behaviors similar to regolith migration in numeric simulations, showing intermittency like general sheared granular systems. We noticed the sand surface approaches criticality as the spin-up proceeds, consistent with the observation that top-shaped asteroids near limiting spin rate take on a critical shape. Comparisons to large-scale numeric simulations and analytical solutions reveal underlying similarities between our experiments and the million-year evolution of asteroid regolith under YORP acceleration, which raises the possibility of studying asteroid surface processes in laboratory analogue experiments.
Submitted 19 May, 2021;
originally announced May 2021.
-
Ranking the information content of distance measures
Authors:
Aldo Glielmo,
Claudio Zeni,
Bingqing Cheng,
Gabor Csanyi,
Alessandro Laio
Abstract:
Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and also units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Using the fewest features but still retaining sufficient information about the system is crucial in many statistical learning approaches, particularly when data are sparse. We introduce a statistical test that can assess the relative information retained when using two different distance measures, and determine if they are equivalent, independent, or if one is more informative than the other. This in turn allows finding the most informative distance measure out of a pool of candidates. The approach is applied to find the most relevant policy variables for controlling the Covid-19 epidemic and to find compact yet informative representations of atomic structures, but its potential applications are wide ranging in many branches of science.
Submitted 25 May, 2022; v1 submitted 30 April, 2021;
originally announced April 2021.
-
Rethinking Ensemble-Distillation for Semantic Segmentation Based Unsupervised Domain Adaptation
Authors:
Chen-Hao Chao,
Bo-Wun Cheng,
Chun-Yi Lee
Abstract:
Recent research on unsupervised domain adaptation (UDA) has demonstrated that end-to-end ensemble learning frameworks serve as a compelling option for UDA tasks. Nevertheless, these end-to-end ensemble learning methods often lack flexibility as any modification to the ensemble requires retraining of their frameworks. To address this problem, we propose a flexible ensemble-distillation framework for performing semantic segmentation based UDA, allowing any arbitrary composition of the members in the ensemble while still maintaining its superior performance. To achieve such flexibility, our framework is designed to be robust against the output inconsistency and the performance variation of the members within the ensemble. To examine the effectiveness and the robustness of our method, we perform an extensive set of experiments on both GTA5 to Cityscapes and SYNTHIA to Cityscapes benchmarks to quantitatively inspect the improvements achievable by our method. We further provide detailed analyses to validate that our design choices are practical and beneficial. The experimental evidence validates that the proposed method indeed offers superior performance, robustness and flexibility in semantic segmentation based UDA tasks against contemporary baseline methods.
Submitted 29 April, 2021;
originally announced April 2021.
-
Pseudo-IoU: Improving Label Assignment in Anchor-Free Object Detection
Authors:
Jiachen Li,
Bowen Cheng,
Rogerio Feris,
Jinjun Xiong,
Thomas S. Huang,
Wen-Mei Hwu,
Humphrey Shi
Abstract:
Current anchor-free object detectors are quite simple and effective yet lack accurate label assignment methods, which limits their potential in competing with classic anchor-based models that are supported by well-designed assignment methods based on the Intersection-over-Union~(IoU) metric. In this paper, we present \textbf{Pseudo-Intersection-over-Union~(Pseudo-IoU)}: a simple metric that brings more standardized and accurate assignment rule into anchor-free object detection frameworks without any additional computational cost or extra parameters for training and testing, making it possible to further improve anchor-free object detection by utilizing training samples of good quality under effective assignment rules that have been previously applied in anchor-based methods. By incorporating Pseudo-IoU metric into an end-to-end single-stage anchor-free object detection framework, we observe consistent improvements in their performance on general object detection benchmarks such as PASCAL VOC and MSCOCO. Our method (single-model and single-scale) also achieves comparable performance to other recent state-of-the-art anchor-free methods without bells and whistles. Our code is based on mmdetection toolbox and will be made publicly available at https://github.com/SHI-Labs/Pseudo-IoU-for-Anchor-Free-Object-Detection.
Submitted 28 April, 2021;
originally announced April 2021.
-
ProphetNet-X: Large-Scale Pre-training Models for English, Chinese, Multi-lingual, Dialog, and Code Generation
Authors:
Weizhen Qi,
Yeyun Gong,
Yu Yan,
Can Xu,
Bolun Yao,
Bartuer Zhou,
Biao Cheng,
Daxin Jiang,
Jiusheng Chen,
Ruofei Zhang,
Houqiang Li,
Nan Duan
Abstract:
Now, the pre-training technique is ubiquitous in natural language processing field. ProphetNet is a pre-training based natural language generation method which shows powerful performance on English text summarization and question generation tasks. In this paper, we extend ProphetNet into other domains and languages, and present the ProphetNet family pre-training models, named ProphetNet-X, where X can be English, Chinese, Multi-lingual, and so on. We pre-train a cross-lingual generation model ProphetNet-Multi, a Chinese generation model ProphetNet-Zh, two open-domain dialog generation models ProphetNet-Dialog-En and ProphetNet-Dialog-Zh. And also, we provide a PLG (Programming Language Generation) model ProphetNet-Code to show the generation performance besides NLG (Natural Language Generation) tasks. In our experiments, ProphetNet-X models achieve new state-of-the-art performance on 10 benchmarks. All the models of ProphetNet-X share the same model structure, which allows users to easily switch between different models. We make the code and models publicly available, and we will keep updating more pre-training models and finetuning scripts.
Submitted 22 June, 2021; v1 submitted 16 April, 2021;
originally announced April 2021.
-
Temperature-sensitive spatial distribution of defects in PdSe2 flakes
Authors:
Xiaowei Liu,
Yaojia Wang,
Qiqi Guo,
Shijun Liang,
Tao Xu,
Bo Liu,
Jiabin Qiao,
Shengqiang Lai,
Junwen Zeng,
Song Hao,
Chenyi Gu,
Tianjun Cao,
Chenyu Wang,
Yu Wang,
Chen Pan,
Guangxu Su,
Yuefeng Nie,
Xiangang Wan,
Litao Sun,
Zhenlin Wang,
Lin He,
Bin Cheng,
Feng Miao
Abstract:
Defect engineering plays an important role in tailoring the electronic transport properties of van der Waals materials. However, it is usually achieved through tuning the type and concentration of defects, rather than dynamically reconfiguring their spatial distribution. Here, we report temperature-sensitive spatial redistribution of defects in PdSe2 thin flakes through scanning tunneling microscopy (STM). We observe that the spatial distribution of Se vacancies in PdSe2 flakes exhibits a strong anisotropic characteristic at 80 K, and that this orientation-dependent feature is weakened when temperature is raised. Moreover, we carry out transport measurements on PdSe2 thin flakes and show that the anisotropic features of carrier mobility and phase coherent length are also sensitive to temperature. Combining with theoretical analysis, we conclude that temperature-driven defect spatial redistribution could interpret the temperature-sensitive electrical transport behaviors in PdSe2 thin flakes. Our work highlights that engineering spatial distribution of defects in the van der Waals materials, which has been overlooked before, may open up a new avenue to tailor the physical properties of materials and explore new device functionalities.
Submitted 14 April, 2021;
originally announced April 2021.
-
Pointly-Supervised Instance Segmentation
Authors:
Bowen Cheng,
Omkar Parkhi,
Alexander Kirillov
Abstract:
We propose an embarrassingly simple point annotation scheme to collect weak supervision for instance segmentation. In addition to bounding boxes, we collect binary labels for a set of points uniformly sampled inside each bounding box. We show that the existing instance segmentation models developed for full mask supervision can be seamlessly trained with point-based supervision collected via our scheme. Remarkably, Mask R-CNN trained on COCO, PASCAL VOC, Cityscapes, and LVIS with only 10 annotated random points per object achieves 94%--98% of its fully-supervised performance, setting a strong baseline for weakly-supervised instance segmentation. The new point annotation scheme is approximately 5 times faster than annotating full object masks, making high-quality instance segmentation more accessible in practice.
Inspired by the point-based annotation form, we propose a modification to PointRend instance segmentation module. For each object, the new architecture, called Implicit PointRend, generates parameters for a function that makes the final point-level mask prediction. Implicit PointRend is more straightforward and uses a single point-level mask loss. Our experiments show that the new module is more suitable for the point-based supervision.
Submitted 15 June, 2022; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds
Authors:
Bowen Cheng,
Lu Sheng,
Shaoshuai Shi,
Ming Yang,
Dong Xu
Abstract:
3D object detection in point clouds is a challenging vision task that benefits various applications for understanding the 3D visual world. Lots of recent research focuses on how to exploit end-to-end trainable Hough voting for generating object proposals. However, the current voting strategy can only receive partial votes from the surfaces of potential objects together with severe outlier votes from the cluttered backgrounds, which hampers full utilization of the information from the input point clouds. Inspired by the back-tracing strategy in the conventional Hough voting methods, in this work, we introduce a new 3D object detection method, named as Back-tracing Representative Points Network (BRNet), which generatively back-traces the representative points from the vote centers and also revisits complementary seed points around these generated points, so as to better capture the fine local structural features surrounding the potential objects from the raw point clouds. Therefore, this bottom-up and then top-down strategy in our BRNet enforces mutual consistency between the predicted vote centers and the raw surface points and thus achieves more reliable and flexible object localization and class prediction results. Our BRNet is simple but effective, which significantly outperforms the state-of-the-art methods on two large-scale point cloud datasets, ScanNet V2 (+7.5% in terms of mAP@0.50) and SUN RGB-D (+4.7% in terms of mAP@0.50), while it is still lightweight and efficient. Code will be available at https://github.com/cheng052/BRNet.
Submitted 14 April, 2021; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Integrating Subgraph-aware Relation and Direction Reasoning for Question Answering
Authors:
Xu Wang,
Shuai Zhao,
Bo Cheng,
Jiale Han,
Yingting Li,
Hao Yang,
Ivan Sekulic,
Guoshun Nan
Abstract:
Question Answering (QA) models over Knowledge Bases (KBs) are capable of providing more precise answers by utilizing relation information among entities. Although effective, most of these models solely rely on fixed relation representations to obtain answers for different question-related KB subgraphs. Hence, the rich structured information of these subgraphs may be overlooked by the relation representation vectors. Meanwhile, the direction information of reasoning, which has been proven effective for the answer prediction on graphs, has not been fully explored in existing work. To address these challenges, we propose a novel neural model, Relation-updated Direction-guided Answer Selector (RDAS), which converts relations in each subgraph to additional nodes to learn structure information. Additionally, we utilize direction information to enhance the reasoning ability. Experimental results show that our model yields substantial improvements on two widely used datasets.
Submitted 31 March, 2021;
originally announced April 2021.
-
Boundary IoU: Improving Object-Centric Image Segmentation Evaluation
Authors:
Bowen Cheng,
Ross Girshick,
Piotr Dollár,
Alexander C. Berg,
Alexander Kirillov
Abstract:
We present Boundary IoU (Intersection-over-Union), a new segmentation evaluation measure focused on boundary quality. We perform an extensive analysis across different error types and object sizes and show that Boundary IoU is significantly more sensitive than the standard Mask IoU measure to boundary errors for large objects and does not over-penalize errors on smaller objects. The new quality measure displays several desirable characteristics like symmetry w.r.t. prediction/ground truth pairs and balanced responsiveness across scales, which makes it more suitable for segmentation evaluation than other boundary-focused measures like Trimap IoU and F-measure. Based on Boundary IoU, we update the standard evaluation protocols for instance and panoptic segmentation tasks by proposing the Boundary AP (Average Precision) and Boundary PQ (Panoptic Quality) metrics, respectively. Our experiments show that the new evaluation metrics track boundary quality improvements that are generally overlooked by current Mask IoU-based evaluation metrics. We hope that the adoption of the new boundary-sensitive evaluation metrics will lead to rapid progress in segmentation methods that improve boundary quality.
Submitted 30 March, 2021;
originally announced March 2021.
-
Integrated Decision and Control: Towards Interpretable and Computationally Efficient Driving Intelligence
Authors:
Yang Guan,
Yangang Ren,
Qi Sun,
Shengbo Eben Li,
Haitong Ma,
Jingliang Duan,
Yifan Dai,
Bo Cheng
Abstract:
Decision and control are core functionalities of high-level automated vehicles. Current mainstream methods, such as functionality decomposition and end-to-end reinforcement learning (RL), either suffer high time complexity or poor interpretability and adaptability on real-world autonomous driving tasks. In this paper, we present an interpretable and computationally efficient framework called integrated decision and control (IDC) for automated vehicles, which decomposes the driving task into static path planning and dynamic optimal tracking that are structured hierarchically. First, the static path planning generates several candidate paths only considering static traffic elements. Then, the dynamic optimal tracking is designed to track the optimal path while considering the dynamic obstacles. To that end, we formulate a constrained optimal control problem (OCP) for each candidate path, optimize them separately and follow the one with the best tracking performance. To unload the heavy online computation, we propose a model-based reinforcement learning (RL) algorithm that can be served as an approximate constrained OCP solver. Specifically, the OCPs for all paths are considered together to construct a single complete RL problem and then solved offline in the form of value and policy networks, for real-time online path selecting and tracking respectively. We verify our framework in both simulations and the real world. Results show that compared with baseline methods IDC has an order of magnitude higher online computing efficiency, as well as better driving performance including traffic efficiency and safety. In addition, it yields great interpretability and adaptability among different driving tasks. The effectiveness of the proposed method is also demonstrated in real road tests with complicated traffic conditions.
Submitted 11 May, 2021; v1 submitted 18 March, 2021;
originally announced March 2021.
-
Predicting the phase behaviors of superionic water at planetary conditions
Authors:
Bingqing Cheng,
Mandy Bethkenhagen,
Chris J. Pickard,
Sebastien Hamel
Abstract:
Most water in the universe may be superionic, and its thermodynamic and transport properties are crucial for planetary science but difficult to probe experimentally or theoretically. We use machine learning and free energy methods to overcome the limitations of quantum mechanical simulations, and characterize hydrogen diffusion, superionic transitions, and phase behaviors of water at extreme conditions. We predict that a close-packed superionic phase with mixed stacking is stable over a wide temperature and pressure range, while a body-centered cubic phase is only thermodynamically stable in a small window but is kinetically favored. Our phase boundaries, which are consistent with the existing, albeit scarce, experimental observations, help resolve the fractions of insulating ice, different superionic phases, and liquid water inside of ice giants.
Submitted 16 March, 2021;
originally announced March 2021.
-
State-space aerodynamic model reveals high force control authority and predictability in flapping flight
Authors:
Yagiz E. Bayiz,
Bo Cheng
Abstract:
Flying animals resort to fast, large-degree-of-freedom motion of flapping wings, a key feature that distinguishes them from rotary or fixed-winged robotic fliers with limited motion of aerodynamic surfaces. However, flapping-wing aerodynamics are characterised by highly unsteady and three-dimensional flows difficult to model or control, and accurate aerodynamic force predictions often rely on expensive computational or experimental methods. Here, we developed a computationally efficient and data-driven state-space model to dynamically map wing kinematics to aerodynamic forces/moments. This model was trained and tested with a total of 548 different flapping-wing motions and surpassed the accuracy and generality of the existing quasi-steady models. This model used 12 states to capture the unsteady and nonlinear fluid effects pertinent to force generation without explicit information of fluid flows. We also provided a comprehensive assessment of the control authority of key wing kinematic variables and found that instantaneous aerodynamic forces/moments were largely predictable by the wing motion history within a half-stroke cycle. Furthermore, the angle of attack, normal acceleration, and pitching motion had the strongest effects on the aerodynamic force/moment generation. Our results show that flapping flight inherently offers high force control authority and predictability, which can be key to developing agile and stable aerial fliers.
Submitted 4 August, 2021; v1 submitted 14 March, 2021;
originally announced March 2021.
-
Recurrent Model Predictive Control
Authors:
Zhengyu Liu,
Jingliang Duan,
Wenxuan Wang,
Shengbo Eben Li,
Yuming Yin,
Ziyu Lin,
Qi Sun,
Bo Cheng
Abstract:
This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems. Unlike traditional Model Predictive Control (MPC) algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs a recurrent function to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. The number of prediction steps is equal to the number of recurrent cycles of the learned policy function. With an arbitrary initial policy function, the proposed RMPC algorithm can converge to the optimal policy by directly minimizing the designed loss function. We further prove the convergence and optimality of the RMPC algorithm through the Bellman optimality principle, and demonstrate its generality and efficiency using two numerical examples.
Submitted 23 February, 2021;
originally announced February 2021.
-
Mixed Policy Gradient: off-policy reinforcement learning driven jointly by data and model
Authors:
Yang Guan,
Jingliang Duan,
Shengbo Eben Li,
Jie Li,
Jianyu Chen,
Bo Cheng
Abstract:
Reinforcement learning (RL) shows great potential in sequential decision-making. At present, mainstream RL algorithms are data-driven, which usually yield better asymptotic performance but much slower convergence compared with model-driven methods. This paper proposes mixed policy gradient (MPG) algorithm, which fuses the empirical data and the transition model in policy gradient (PG) to accelerate convergence without performance degradation. Formally, MPG is constructed as a weighted average of the data-driven and model-driven PGs, where the former is the derivative of the learned Q-value function, and the latter is that of the model-predictive return. To guide the weight design, we analyze and compare the upper bound of each PG error. Relying on that, a rule-based method is employed to heuristically adjust the weights. In particular, to get a better PG, the weight of the data-driven PG is designed to grow along the learning process while the other to decrease. Simulation results show that the MPG method achieves the best asymptotic performance and convergence speed compared with other baseline algorithms.
Submitted 24 February, 2024; v1 submitted 23 February, 2021;
originally announced February 2021.
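The weighted-average construction described in the Mixed Policy Gradient abstract above can be illustrated with a minimal sketch. The function names, the list-of-floats gradient representation, and the linear weight schedule are all assumptions made for illustration; the abstract only says that a rule-based method adjusts the weights so that the data-driven weight grows during learning, and does not give the rule itself.

```python
from typing import List


def data_weight(step: int, total_steps: int) -> float:
    """Hypothetical linear schedule: the data-driven weight grows from 0 to 1.

    This stands in for the paper's rule-based weight adjustment, which the
    abstract does not specify.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(max(step / total_steps, 0.0), 1.0)


def mixed_policy_gradient(
    g_data: List[float], g_model: List[float], w_data: float
) -> List[float]:
    """Weighted average of a data-driven and a model-driven gradient estimate."""
    if len(g_data) != len(g_model):
        raise ValueError("gradient estimates must have the same length")
    if not 0.0 <= w_data <= 1.0:
        raise ValueError("w_data must lie in [0, 1]")
    # The model-driven weight is the complement, so it shrinks as w_data grows.
    return [w_data * d + (1.0 - w_data) * m for d, m in zip(g_data, g_model)]
```

Early in training, `data_weight` is close to 0 and the mixed gradient follows the model-driven estimate; late in training it follows the data-driven one.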
-
Recurrent Model Predictive Control: Learning an Explicit Recurrent Controller for Nonlinear Systems
Authors:
Zhengyu Liu,
Jingliang Duan,
Wenxuan Wang,
Shengbo Eben Li,
Yuming Yin,
Ziyu Lin,
Bo Cheng
Abstract:
This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems. It can be regarded as an explicit solver of traditional Model Predictive Control (MPC) algorithms, which can adaptively select an appropriate model prediction horizon according to current computing resources, so as to improve the policy performance. Our algorithm employs a recurrent function to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. The output of the learned policy network after N recurrent cycles corresponds to the nearly optimal solution of N-step MPC. A policy optimization objective is designed by decomposing the MPC cost function according to Bellman's principle of optimality. The optimal recurrent policy can be obtained by directly minimizing the designed objective function, which is applicable to general nonlinear and non-input-affine systems. Both simulation-based and real-robot path-tracking tasks are utilized to demonstrate the effectiveness of the proposed method.
Submitted 8 April, 2022; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems
Authors:
John A. Keith,
Valentin Vassilev-Galindo,
Bingqing Cheng,
Stefan Chmiela,
Michael Gastegger,
Klaus-Robert Müller,
Alexandre Tkatchenko
Abstract:
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We then follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Submitted 11 February, 2021;
originally announced February 2021.
-
Wiener Filter versus Recurrent Neural Network-based 2D-Channel Estimation for V2X Communications
Authors:
Moritz Benedikt Fischer,
Sebastian Dörner,
Sebastian Cammerer,
Takayuki Shimizu,
Bin Cheng,
Hongsheng Lu,
Stephan ten Brink
Abstract:
We compare the potential of neural network (NN)-based channel estimation with classical linear minimum mean square error (LMMSE)-based estimators, also known as Wiener filtering. For this, we propose a low-complexity recurrent neural network (RNN)-based estimator that allows channel equalization of a sequence of channel observations based on independent time- and frequency-domain long short-term memory (LSTM) cells. Motivated by Vehicle-to-Everything (V2X) applications, we simulate time- and frequency-selective channels with orthogonal frequency division multiplex (OFDM) and extend our channel models in such a way that a continuous degradation from line-of-sight (LoS) to non-line-of-sight (NLoS) conditions can be emulated. It turns out that the NN-based system can not only compete with the LMMSE equalizer, but can also be trained for resilience against system parameter mismatch. We thereby showcase the conceptual simplicity of such a data-driven system design, as this not only enables more robustness against, e.g., signal-to-noise-ratio (SNR) or Doppler spread estimation mismatches, but also allows the same equalizer to be used over a wider range of input parameters without the need to rebuild (or re-estimate) the filter coefficients. Particular attention has been paid to ensuring compatibility with the existing IEEE 802.11p piloting scheme for V2X communications. Finally, feeding the payload data symbols as additional equalizer input unleashes further performance gains. We show significant gains over the conventional LMMSE equalization for highly dynamic channel conditions if such a data-augmented equalization scheme is used.
Submitted 21 May, 2021; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Separated transport relaxation scales and interband scattering in SrRuO$_3$, CaRuO$_3$, and Sr$_2$RuO$_4$ thin films
Authors:
Youcheng Wang,
H. P. Nair,
N. J. Schreiber,
J. P. Ruf,
Bing Cheng,
D. G. Schlom,
K. M. Shen,
N. P. Armitage
Abstract:
The anomalous charge transport observed in some strongly correlated metals raises questions as to the universal applicability of Landau Fermi liquid theory. The coherence temperature $T_{FL}$ for normal metals is usually taken to be the temperature below which $T^2$ is observed in the resistivity. Below this temperature, a Fermi liquid with well-defined quasiparticles is expected. However, metallic ruthenates in the Ruddlesden-Popper family frequently show non-Drude low-energy optical conductivity and unusual $ω/T$ scaling, despite the frequent observation of $T^2$ dc resistivity. Herein we report time-domain THz spectroscopy measurements of several different high-quality metallic ruthenate thin films and show that the optical conductivity can be interpreted in more conventional terms. In all materials, the conductivity has a two-Drude peak lineshape at low temperature and a crossover to a one-Drude peak lineshape at higher temperatures. The two-component low-temperature conductivity is indicative of two well-separated current relaxation rates for different conduction channels. We discuss three particular possibilities for the separation of rates: (a) strongly energy-dependent inelastic scattering; (b) an almost-conserved pseudomomentum operator that overlaps with the current, giving rise to the narrower Drude peak; (c) the presence of multiple conduction channels that undergo a crossover to stronger interband scattering at higher temperatures. None of these scenarios require the existence of exotic quasiparticles. The results may give insight into the possible significance of Hund's coupling in determining interband coupling in these materials. Our results also show a route towards understanding the violation of Matthiessen's rule in this class of materials and deviations from the "Gurzhi" scaling relations in Fermi liquids.
Submitted 23 December, 2020;
originally announced December 2020.
-
ScaleNAS: One-Shot Learning of Scale-Aware Representations for Visual Recognition
Authors:
Hsin-Pai Cheng,
Feng Liang,
Meng Li,
Bowen Cheng,
Feng Yan,
Hai Li,
Vikas Chandra,
Yiran Chen
Abstract:
Scale variance among different sizes of body parts and objects is a challenging problem for visual recognition tasks. Existing works usually design a dedicated backbone or apply Neural Architecture Search (NAS) for each task to tackle this challenge. However, existing works impose significant limitations on the design or search space. To solve these problems, we present ScaleNAS, a one-shot learning method for exploring scale-aware representations. ScaleNAS solves multiple tasks at a time by searching multi-scale feature aggregation. ScaleNAS adopts a flexible search space that allows an arbitrary number of blocks and cross-scale feature fusions. To cope with the high search cost incurred by the flexible space, ScaleNAS employs one-shot learning for a multi-scale supernet driven by grouped sampling and evolutionary search. Without further retraining, ScaleNet can be directly deployed for different visual recognition tasks with superior performance. We use ScaleNAS to create high-resolution models for two different tasks, ScaleNet-P for human pose estimation and ScaleNet-S for semantic segmentation. ScaleNet-P and ScaleNet-S outperform existing manually crafted and NAS-based methods in both tasks. When applied to bottom-up human pose estimation, ScaleNet-P surpasses the state-of-the-art HigherHRNet. In particular, ScaleNet-P4 achieves 71.6% AP on COCO test-dev, a new state-of-the-art result.
Submitted 30 November, 2020;
originally announced November 2020.
-
End-to-End Video Instance Segmentation with Transformers
Authors:
Yuqing Wang,
Zhaoliang Xu,
Xinlong Wang,
Chunhua Shen,
Baoshan Cheng,
Hao Shen,
Huaxia Xia
Abstract:
Video instance segmentation (VIS) is the task that requires simultaneously classifying, segmenting and tracking object instances of interest in video. Recent methods typically develop sophisticated pipelines to tackle this task. Here, we propose a new video instance segmentation framework built upon Transformers, termed VisTR, which views the VIS task as a direct end-to-end parallel sequence decoding/prediction problem. Given a video clip consisting of multiple image frames as input, VisTR outputs the sequence of masks for each instance in the video in order directly. At the core is a new, effective instance sequence matching and segmentation strategy, which supervises and segments instances at the sequence level as a whole. VisTR frames the instance segmentation and tracking in the same perspective of similarity learning, thus considerably simplifying the overall pipeline and is significantly different from existing approaches. Without bells and whistles, VisTR achieves the highest speed among all existing VIS models, and achieves the best result among methods using single model on the YouTube-VIS dataset. For the first time, we demonstrate a much simpler and faster video instance segmentation framework built upon Transformers, achieving competitive accuracy. We hope that VisTR can motivate future research for more video understanding tasks.
Submitted 8 October, 2021; v1 submitted 29 November, 2020;
originally announced November 2020.
-
Quantum-mechanical exploration of the phase diagram of water
Authors:
Aleks Reinhardt,
Bingqing Cheng
Abstract:
The phase diagram of water harbours many mysteries: some of the phase boundaries are fuzzy, and the set of known stable phases may not be complete. Starting from liquid water and a comprehensive set of 50 ice structures, we compute the phase diagram at three hybrid density-functional-theory levels of approximation, accounting for thermal and nuclear fluctuations as well as proton disorder. Such calculations are only made tractable because we combine machine-learning methods and advanced free-energy techniques. The computed phase diagram is in qualitative agreement with experiment, particularly at pressures $\lesssim$8000 bar, and the discrepancy in chemical potential is comparable with the subtle uncertainties introduced by proton disorder and the spread between the three hybrid functionals. None of the hypothetical ice phases considered is thermodynamically stable in our calculations, suggesting the completeness of the experimental water phase diagram in the region considered. Our work demonstrates the feasibility of predicting the phase diagram of a polymorphic system from first principles and provides a thermodynamic way of testing the limits of quantum-mechanical calculations.
Submitted 26 October, 2020;
originally announced October 2020.
-
THz range Faraday rotation in the Weyl Semimetal Candidate $\mathrm{Co_2TiGe}$
Authors:
Rishi Bhandia,
Bing Cheng,
Tobias Brown-Heft,
Shouvik Chatterjee,
Christopher J. Palmstrøm,
N. Peter Armitage
Abstract:
The $\mathrm{Co_2}$ family of ferromagnetic Heusler alloys have attracted interest due to their fully spin-polarized nature, making them ideal for applications in spintronic devices. More recently, the existence of room temperature time-reversal-breaking Weyl nodes near the Fermi level was predicted and confirmed in these systems. As a result of the presence of these Weyl nodes, these systems possess a non-zero momentum space Berry curvature that can dramatically influence transport properties such as the anomalous Hall effect. One of these candidate compounds is $\mathrm{Co_2 Ti Ge}$. Recently, high quality molecular beam epitaxy-grown thin films of $\mathrm{Co_2 Ti Ge}$ have become available. In this work, we present THz-range measurements of MBE-grown $\mathrm{Co_2 Ti Ge}$ films. We measure the THz-range Faraday rotation, which can be understood as a measure of the anomalous Hall effect. We supplement this work with electronic band structure calculations showing that the principal contribution to the anomalous Hall effect in this material stems from the Berry curvature of the material. Our work shows that this class of Heusler materials shows promise for Weyl semimetal based spintronics.
△ Less
Submitted 30 November, 2020; v1 submitted 16 October, 2020;
originally announced October 2020.
-
Verification of Group Non-membership by Shallow Quantum Circuits
Authors:
Kai Sun,
Zi-Jian Zhang,
Fei Meng,
Bin Cheng,
Zhu Cao,
Jin-Shi Xu,
Man-Hong Yung,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
Decision problems are the problems whose answer is either YES or NO. As the quantum analogue of $\mathsf{NP}$ (nondeterministic polynomial time), the class $\mathsf{QMA}$ (quantum Merlin-Arthur) contains the decision problems whose YES instance can be verified efficiently with a quantum computer. The problem of deciding the group non-membership (GNM) of a group element is known to be in $\mathsf{QMA}$. Previous works on the verification of GNM required a quantum circuit with $O(n^5)$ group oracle calls. Here we propose an efficient way to verify GNM problems, reducing the circuit depth to $O(1)$ and the number of qubits by half. We further experimentally demonstrate the scheme, in which two-element subgroups in a four-element group are employed for the verification task. A significant completeness-soundness gap is observed in the experiment.
△ Less
Submitted 7 October, 2020;
originally announced October 2020.
-
Butterfly-like anisotropic magnetoresistance and angle-dependent Berry phase in Type-II Weyl semimetal WP2
Authors:
Kaixuan Zhang,
Yongping Du,
Pengdong Wang,
Laiming Wei,
Lin Li,
Qiang Zhang,
Wei Qin,
Zhiyong Lin,
Bin Cheng,
Yifan Wang,
Han Xu,
Xiaodong Fan,
Zhe Sun,
Xiangang Wan,
Changgan Zeng
Abstract:
Weyl semimetal emerges as a new topologically nontrivial phase of matter, hosting low-energy excitations of massless Weyl fermions. Here, we present a comprehensive study of the type-II Weyl semimetal WP2. Transport studies show a butterfly-like magnetoresistance at low temperature, reflecting the anisotropy of the electron Fermi surfaces. The four-lobed feature gradually evolves into a two-lobed one upon increasing temperature, mainly due to the reduced relative contribution of electron Fermi surfaces compared to hole Fermi surfaces for the magnetoresistance. Moreover, an angle-dependent Berry phase is further discovered from the quantum oscillations, which is ascribed to the effective manipulation of the extremal Fermi orbits by the magnetic field to feel the nearby topological singularities in the momentum space. The revealed topological characters and anisotropic Fermi surfaces of WP2 substantially enrich the physical properties of Weyl semimetals and hold great promise in topological electronic and Fermitronic device applications.
Submitted 31 August, 2020;
originally announced August 2020.
-
N-Graphene Synthesized in Astrochemical Ices
Authors:
K K Rahul,
M Ambresh,
D Sahu,
J K Meka,
S -L Chou,
Y -J Wu,
D Gupta,
A Das,
J -I Lo,
B -M Cheng,
B N Raja Sekhar,
A Bhardwaj,
H Hill,
P Janardhan,
N J Mason,
B Sivaraman
Abstract:
Icy mantles of benzonitrile, an aromatic with a cyanide side chain that has recently been detected in the interstellar medium, were subjected to vacuum ultraviolet photon irradiation and found to form a residue. The residue was removed from the substrate and placed on a Quantifoil grid for electron microscopy analysis. Transmission electron microscopy showed Quantum Dot (QD) and Nitrogen-doped Graphene (N-Graphene) sheets. Diffraction and Energy Dispersive X-ray Spectroscopy revealed the crystalline nature and carbon-nitrogen composition of the observed graphene sheet. This is the first result showing QD and N-Graphene synthesis in ice irradiation at interstellar temperatures.
Submitted 23 August, 2020;
originally announced August 2020.
-
$S^3$Net: Semantic-Aware Self-supervised Depth Estimation with Monocular Videos and Synthetic Data
Authors:
Bin Cheng,
Inderjot Singh Saggu,
Raunak Shah,
Gaurav Bansal,
Dinesh Bharadia
Abstract:
Solving depth estimation with monocular cameras enables the possibility of widespread use of cameras as low-cost depth estimation sensors in applications such as autonomous driving and robotics. However, learning such a scalable depth estimation model would require a lot of labeled data which is expensive to collect. There are two popular existing approaches which do not require annotated depth maps: (i) using labeled synthetic and unlabeled real data in an adversarial framework to predict more accurate depth, and (ii) unsupervised models which exploit geometric structure across space and time in monocular video frames. Ideally, we would like to leverage features provided by both approaches as they complement each other; however, existing methods do not adequately exploit these additive benefits. We present $S^3$Net, a self-supervised framework which combines these complementary features: we use synthetic and real-world images for training while exploiting geometric, temporal, as well as semantic constraints. Our novel consolidated architecture provides a new state-of-the-art in self-supervised depth estimation using monocular videos. We present a unique way to train this self-supervised framework, and achieve (i) more than $15\%$ improvement over previous synthetic supervised approaches that use domain adaptation and (ii) more than $10\%$ improvement over previous self-supervised approaches which exploit geometric constraints from the real data.
Submitted 28 July, 2020;
originally announced July 2020.
-
Calibration of a superconducting gravimeter with an absolute atom gravimeter
Authors:
S. Merlet,
P. Gillot,
B. Cheng,
R. Karcher,
A. Imanaliev,
L. Timmen,
F. Pereira Dos Santos
Abstract:
We present a 27-day-long common-view measurement of an absolute cold atom gravimeter (CAG) and a relative iGrav superconducting gravimeter, which we use to calibrate the iGrav scale factor. This allowed us to push the CAG long-term stability down to the level of 0.5 nm.s$^{-2}$. We investigate the impact of the duration of the measurement on the uncertainty in the determination of the correlation factor and show that it is limited to about 3‰ by the coloured noise of our cold atom gravimeter. A 3-day-long measurement session with an additional FG5X absolute gravimeter allows us to directly compare the calibration results obtained with two different absolute meters. Based on our analysis, we expect that with an improvement of its long-term stability, the CAG will allow the iGrav scale factor to be calibrated to better than the per mille level (1$σ$ level of confidence) after only one day of concurrent measurements for maximum tidal amplitudes.
Submitted 15 February, 2021; v1 submitted 20 July, 2020;
originally announced July 2020.
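As a rough illustration of what calibrating a relative gravimeter against an absolute one involves, the sketch below fits a scale factor and offset by ordinary least squares over concurrent readings. This is an assumed, simplified procedure, not the analysis used in the paper above; the function name `fit_scale_factor` and the sample values in the usage note are hypothetical.

```python
from typing import Sequence, Tuple


def fit_scale_factor(
    relative: Sequence[float], absolute: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of absolute ≈ scale * relative + offset.

    `relative` holds raw readings of the relative instrument and `absolute`
    the concurrent absolute gravity values (both hypothetical inputs).
    """
    n = len(relative)
    if n != len(absolute) or n < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    mean_r = sum(relative) / n
    mean_a = sum(absolute) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0.0:
        raise ValueError("relative readings must vary to determine a scale")
    cov = sum((r - mean_r) * (a - mean_a) for r, a in zip(relative, absolute))
    scale = cov / var_r
    offset = mean_a - scale * mean_r
    return scale, offset
```

For example, `fit_scale_factor([0.0, 1.0, 2.0, 3.0], [10.0, -890.0, -1790.0, -2690.0])` recovers a slope of -900 and an offset of 10 for this synthetic series. A larger signal variation (such as maximum tidal amplitudes) increases `var_r` and so tightens the slope estimate.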
-
Extracting ice phases from liquid water: why a machine-learning water model generalizes so well
Authors:
Bartomeu Monserrat,
Jan Gerit Brandenburg,
Edgar A. Engel,
Bingqing Cheng
Abstract:
We investigate the structural similarities between liquid water and 53 ices, including 20 known crystalline phases. We base such similarity comparison on the local environments that consist of atoms within a certain cutoff radius of a central atom. We reveal that liquid water explores the local environments of the diverse ice phases, by directly comparing the environments in these phases using general atomic descriptors, and also by demonstrating that a machine-learning potential trained on liquid water alone can predict the densities, the lattice energies, and vibrational properties of the ices. The finding that the local environments characterising the different ice phases are found in water sheds light on water phase behaviors, and rationalizes the transferability of water models between different phases.
Submitted 23 June, 2020;
originally announced June 2020.
-
Unconventional free charge in the correlated semimetal Nd2Ir2O7
Authors:
K. Wang,
B. Xu,
C. W. Rischau,
N. Bachar,
B. Michon,
J. Teyssier,
Y. Qiu,
T. Ohtsuki,
Bing Cheng,
N. P. Armitage,
S. Nakatsuji,
D. van der Marel
Abstract:
Nd2Ir2O7 is a correlated semimetal with the pyrochlore structure, in which competing spin-orbit coupling and electron-electron interactions are believed to induce a time-reversal symmetry broken Weyl semimetal phase characterized by pairs of topologically protected Dirac points at the Fermi energy. However, the emergent properties in these materials are far from clear, and exotic new states of matter have been conjectured. Here we demonstrate optically that at low temperatures the free carrier spectral weight is proportional to T^2 where T is the temperature, as expected for massless Dirac electrons. However, we do not observe the corresponding T^3 term in the specific heat. That the system is not in a Fermi liquid state is further corroborated by the "Planckian" T-linear temperature dependence of the momentum relaxation rate and the progressive opening of a correlation-induced gap at low temperatures. These observations cannot be reconciled within the framework of band theory of electron-like quasiparticles and point toward the effective decoupling of the charge transport from the single particle sector.
Submitted 26 May, 2020;
originally announced May 2020.
-
Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation
Authors:
Liang-Chieh Chen,
Raphael Gontijo Lopes,
Bowen Cheng,
Maxwell D. Collins,
Ekin D. Cubuk,
Barret Zoph,
Hartwig Adam,
Jonathon Shlens
Abstract:
Supervised learning in large discriminative models is a mainstay for modern computer vision. Such an approach necessitates investing in large-scale human-annotated datasets for achieving state-of-the-art results. In turn, the efficacy of supervised learning may be limited by the size of the human-annotated dataset. This limitation is particularly notable for image segmentation tasks, where the expense of human annotation is especially large, yet large amounts of unlabeled data may exist. In this work, we ask if we may leverage semi-supervised learning in unlabeled video sequences and extra images to improve the performance on urban scene segmentation, simultaneously tackling semantic, instance, and panoptic segmentation. The goal of this work is to avoid the construction of sophisticated, learned architectures specific to label propagation (e.g., patch matching and optical flow). Instead, we simply predict pseudo-labels for the unlabeled data and train subsequent models with both human-annotated and pseudo-labeled data. The procedure is iterated several times. As a result, our Naive-Student model, trained with such simple yet effective iterative semi-supervised learning, attains state-of-the-art results at all three Cityscapes benchmarks, reaching the performance of 67.8% PQ, 42.6% AP, and 85.2% mIOU on the test set. We view this work as a notable step towards building a simple procedure to harness unlabeled video sequences and extra images to surpass state-of-the-art performance on core computer vision tasks.
Submitted 19 July, 2020; v1 submitted 20 May, 2020;
originally announced May 2020.
-
Computing the heat conductivity of fluids from density fluctuations
Authors:
Bingqing Cheng,
Daan Frenkel
Abstract:
Equilibrium molecular dynamics simulations, in combination with the Green-Kubo (GK) method, have been extensively used to compute the thermal conductivity of liquids. However, the GK method relies on an ambiguous definition of the microscopic heat flux, which depends on how one chooses to distribute energies over atoms. This ambiguity makes it problematic to employ the GK method for systems with non-pairwise interactions. In this work, we show that the hydrodynamic description of thermally driven density fluctuations can be used to obtain the thermal conductivity of a bulk fluid unambiguously, thereby bypassing the need to define the heat flux. We verify that, for a model fluid with only pairwise interactions, our method yields estimates of thermal conductivity consistent with the GK approach. We apply our approach to compute the thermal conductivity of a non-pairwise additive water model at supercritical conditions, and then of a liquid hydrogen system described by a machine-learning interatomic potential, at 33 GPa and 2000 K.
Submitted 15 May, 2020;
originally announced May 2020.