-
Cache-Aided Massive MIMO with Linear Precoding in Multi-cell Systems
Authors:
Lin Xiang,
Xiao Wei,
Laura Cottatellucci,
Robert Schober,
Tao Jiang
Abstract:
In this paper, we propose a novel joint caching and massive multiple-input multiple-output (MIMO) transmission scheme, referred to as cache-aided massive MIMO, for multi-cell downlink transmission to multiple cache-enabled receivers. With the proposed scheme, users who have cached (a portion of) the files that they request are offloaded and, hence, (partially) inactive during downlink transmission. The other users either benefit from the cache-enabled offloading for mitigating pilot contamination or exploit the cached but unrequested files to cancel interference during uplink channel estimation and downlink file reception. Moreover, by redesigning the transmit precoders based on the cache status of the users and channel state information, we gain additional degrees of freedom for massive MIMO transmission. For a given cache status, we analyze the equivalent content delivery rates (ECDRs), i.e., the average rates of delivering a requested file via both caching and massive MIMO transmission to the requesting user, for cache-aided massive MIMO employing redesigned maximum ratio transmission (MRT), zero-forcing (ZF) precoding, and regularized zero-forcing (RZF) precoding. Based on the derived results, the impact of (random) uncoded caching and coded caching on the performance of the redesigned precoding schemes is investigated. Simulation results validate our derivations and show that caching is beneficial for precoded downlink transmission as it enhances the transmit power allocation, mitigates intra- and inter-cell interference, and reduces the impairment caused by pilot contamination. Compared with conventional massive MIMO without caching and with cache-oblivious precoding, the proposed cache-aided massive MIMO scheme achieves a significantly higher ECDR even when the number of users approaches the number of transmit antennas.
Submitted 10 April, 2022;
originally announced April 2022.
-
Calibration Strategy of the JUNO-TAO Experiment
Authors:
Hangkun Xu,
Angel Abusleme,
Nikolay V. Anfimov,
Stéphane Callier,
Agustin Campeny,
Guofu Cao,
Jun Cao,
Cedric Cerna,
Yu Chen,
Alexander Chepurnov,
Yayun Ding,
Frederic Druillole,
Andrea Fabbri,
Zhengyong Fei,
Maxim Gromov,
Miao He,
Wei He,
Yuanqiang He,
Joseph yk Hor,
Shaojing Hou,
Jianrun Hu,
Jun Hu,
Cédric Huss,
Xiaolu Ji,
Tao Jiang
, et al. (46 additional authors not shown)
Abstract:
The Taishan Antineutrino Observatory (JUNO-TAO, or TAO) is a satellite detector for the Jiangmen Underground Neutrino Observatory (JUNO). Located near the Taishan reactor, TAO independently measures the reactor's antineutrino energy spectrum with unprecedented energy resolution. To achieve this goal, energy response must be well calibrated. Using the Automated Calibration Unit (ACU) and the Cable Loop System (CLS) of TAO, multiple radioactive sources are deployed to various positions in the detector to perform a precise calibration of energy response. The non-linear energy response can be controlled within 0.6% with different energy points of these radioactive sources. It can be further improved by using $^{12}\rm B$ decay signals produced by cosmic muons. Through the energy non-uniformity calibration, residual non-uniformity is less than 0.2%. The energy resolution degradation and energy bias caused by the residual non-uniformity can be controlled within 0.05% and 0.3%, respectively. In addition, the stability of other detector parameters, such as the gain of each silicon photo-multiplier, can be monitored with a special ultraviolet LED calibration system.
Submitted 29 May, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
An application of Pixel Interval Down-sampling (PID) for dense tiny microorganism counting on environmental microorganism images
Authors:
Jiawei Zhang,
Xin Zhao,
Tao Jiang,
Md Mamunur Rahaman,
Yudong Yao,
Yu-Hao Lin,
Jinghua Zhang,
Ao Pan,
Marcin Grzegorzek,
Chen Li
Abstract:
This paper proposes a novel pixel interval down-sampling network (PID-Net) for counting dense tiny objects (yeast cells) with higher accuracy. The PID-Net is an end-to-end convolutional neural network (CNN) model with an encoder-decoder architecture. The pixel interval down-sampling operations are concatenated with max-pooling operations to combine the sparse and dense features. This addresses the limitation of contour conglutination of dense objects while counting. The evaluation was conducted using classical segmentation metrics (the Dice, Jaccard and Hausdorff distance) as well as counting metrics. The experimental results show that the proposed PID-Net had the best performance and potential for dense tiny object counting tasks, achieving 96.97% counting accuracy on a dataset of 2448 yeast cell images. Compared with state-of-the-art approaches such as Attention U-Net, Swin U-Net and Trans U-Net, the proposed PID-Net segments dense tiny objects with clearer boundaries and fewer spurious debris regions, which shows the great potential of PID-Net for accurate counting.
Submitted 22 July, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
-
In situ characterization of vacancy ordering in Ge-Sb-Te phase-change memory alloys
Authors:
Ting-Ting Jiang,
Xu-Dong Wang,
Jiang-Jing Wang,
Han-Yi Zhang,
Lu Lu,
Chunlin Jia,
Matthias Wuttig,
Riccardo Mazzarello,
Wei Zhang,
En Ma
Abstract:
Tailoring the degree of structural disorder in Ge-Sb-Te alloys is important for the development of non-volatile phase-change memory and neuro-inspired computing. Upon crystallization from the amorphous phase, these alloys form a cubic rocksalt-like structure with a high content of intrinsic vacancies. Further thermal annealing results in a gradual structural transition towards a layered structure and an insulator-to-metal transition. In this work, we elucidate the atomic-level details of the structural transition in crystalline GeSb2Te4 by in situ high-resolution transmission electron microscopy (HRTEM) experiments and ab initio density functional theory (DFT) calculations, providing a comprehensive real-time and real-space view of the vacancy ordering process. We also discuss the impact of vacancy ordering on altering the electronic and optical properties of GeSb2Te4, which is relevant to multilevel storage applications. The phase evolution paths in Ge-Sb-Te alloys are illustrated using a summary diagram, which serves as a guide for designing phase-change memory devices.
Submitted 24 September, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Learning Progressive Distributed Compression Strategies from Local Channel State Information
Authors:
Foad Sohrabi,
Tao Jiang,
Wei Yu
Abstract:
This paper proposes a deep learning framework to design distributed compression strategies in which distributed agents need to compress high-dimensional observations of a source, then send the compressed bits via bandwidth limited links to a fusion center for source reconstruction. Further, we require the compression strategy to be progressive so that it can adapt to the varying link bandwidths between the agents and the fusion center. Moreover, to ensure scalability, we investigate strategies that depend only on the local channel state information (CSI) at each agent. Toward this end, we use a data-driven approach in which the progressive linear combination and uniform quantization strategy at each agent are trained as a function of its local CSI. To deal with the challenges of modeling the quantization operations (which always produce zero gradients in the training of neural networks), we propose a novel approach of exploiting the statistics of the batch training data to set the dynamic ranges of the uniform quantizers. Numerically, we show that the proposed distributed estimation strategy designed with only local CSI can significantly reduce the signaling overhead and can achieve a lower mean-squared error distortion for source reconstruction than state-of-the-art designs that require global CSI at comparable overall communication cost.
Submitted 9 March, 2022;
originally announced March 2022.
-
$A^{3}D$: A Platform of Searching for Robust Neural Architectures and Efficient Adversarial Attacks
Authors:
Jialiang Sun,
Wen Yao,
Tingsong Jiang,
Chao Li,
Xiaoqian Chen
Abstract:
The robustness of deep neural network (DNN) models has attracted increasing attention due to the urgent need for security in many applications. Numerous existing open-source tools and platforms have been developed to evaluate the robustness of DNN models by ensembling the majority of adversarial attack or defense algorithms. Unfortunately, current platforms cannot optimize the architectures of DNN models or the configuration of adversarial attacks to further enhance the robustness of models or the performance of adversarial attacks. To alleviate these problems, in this paper, we first propose a novel platform called auto adversarial attack and defense ($A^{3}D$), which can help search for robust neural network architectures and efficient adversarial attacks. In $A^{3}D$, we employ multiple neural architecture search methods, which consider different robustness evaluation metrics, including four types of noises: adversarial noise, natural noise, system noise, and quantified metrics, to find robust architectures. In addition, we propose a mathematical model for auto adversarial attack and provide multiple optimization algorithms to search for efficient adversarial attacks. We also combine auto adversarial attack and defense to form a unified framework. In auto adversarial defense, the searched efficient attack can be used as a new robustness evaluation to further enhance robustness. In auto adversarial attack, the searched robust architectures can be used as the threat model to help find stronger adversarial attacks. Experiments on the CIFAR10, CIFAR100, and ImageNet datasets demonstrate the feasibility and effectiveness of the proposed platform, which can also provide a benchmark and toolkit for researchers applying automated machine learning to evaluate and improve DNN model robustness.
Submitted 13 January, 2023; v1 submitted 6 March, 2022;
originally announced March 2022.
-
Efficient Distinction between Quantum Direct and Common Causes and its Experimental Verification
Authors:
Feixiang Xu,
Jia-Yi Lin,
Ben Wang,
Tao Jiang,
Shengjun Wu,
Wei Wang,
Lijian Zhang
Abstract:
Identifying the causal structures between two statistically correlated events has been widely investigated in many fields of science. While some of the well-studied classical methods have been carefully generalized to quantum versions of causal inference for certain cases, an effective and efficient way to detect more general quantum causal structures is still lacking. Here, we introduce a quantity named `Causal Determinant' to efficiently identify the quantum causal structures between two quantum systems and experimentally verify the validity of the method. According to the causal determinant, the quantum direct cause imposed by an arbitrary unitary operator can be perfectly discriminated from the quantum common cause, in which the two quantum systems share a joint quantum state. In addition, the causal determinant can discriminate between more general causal structures and predict the range of their parameters. The ability of our method to detect more general quantum causal structures can shed new light on the field of quantum causal inference.
Submitted 5 March, 2022;
originally announced March 2022.
-
Acceleration of Federated Learning with Alleviated Forgetting in Local Training
Authors:
Chencheng Xu,
Zhiwei Hong,
Minlie Huang,
Tao Jiang
Abstract:
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy by independently training local models on each client and then aggregating parameters on a central server, thereby producing an effective global model. Although a variety of FL algorithms have been proposed, their training efficiency remains low when the data are not independently and identically distributed (non-i.i.d.) across different clients. We observe that the slow convergence rates of the existing methods are (at least partially) caused by the catastrophic forgetting issue during the local training stage on each individual client, which leads to a large increase in the loss function concerning the previous training data at the other clients. Here, we propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage by regularizing locally trained parameters with the loss on generated pseudo data, which encode the knowledge of previous training data learned by the global model. Our comprehensive experiments demonstrate that FedReg not only significantly improves the convergence rate of FL, especially when the neural network architecture is deep and the clients' data are extremely non-i.i.d., but also protects privacy better in classification problems and is more robust against gradient inversion attacks. The code is available at: https://github.com/Zoesgithub/FedReg.
Submitted 4 March, 2022;
originally announced March 2022.
-
Boundary Corrected Multi-scale Fusion Network for Real-time Semantic Segmentation
Authors:
Tianjiao Jiang,
Yi Jin,
Tengfei Liang,
Xu Wang,
Yidong Li
Abstract:
Image semantic segmentation aims at the pixel-level classification of images, which requires both accuracy and speed in practical applications. Existing semantic segmentation methods mainly rely on high-resolution input to achieve high accuracy and do not meet inference-time requirements. Although some methods focus on high-speed scene parsing with lightweight architectures, they cannot fully mine semantic features under a low computation budget, which results in relatively low performance. To realize real-time and high-precision segmentation, we propose a new method named Boundary Corrected Multi-scale Fusion Network, which uses the designed Low-resolution Multi-scale Fusion Module to extract semantic information. Moreover, to deal with boundary errors caused by low-resolution feature map fusion, we further design an additional Boundary Corrected Loss to constrain overly smooth features. Extensive experiments show that our method achieves a state-of-the-art balance of accuracy and speed for real-time semantic segmentation.
Submitted 1 March, 2022;
originally announced March 2022.
-
Cross-Task Knowledge Distillation in Multi-Task Recommendation
Authors:
Chenxiao Yang,
Junwei Pan,
Xiaofeng Gao,
Tingyu Jiang,
Dapeng Liu,
Guihai Chen
Abstract:
Multi-task learning (MTL) has been widely used in recommender systems, wherein predictions of each type of user feedback on items (e.g., click, purchase) are treated as individual tasks and jointly trained with a unified model. Our key observation is that the prediction results of each task may contain task-specific knowledge about users' fine-grained preferences towards items. While such knowledge could be transferred to benefit other tasks, it is overlooked under the current MTL paradigm. This paper instead proposes a Cross-Task Knowledge Distillation framework that attempts to leverage the prediction results of one task as supervised signals to teach another task. However, integrating MTL and knowledge distillation (KD) in a proper manner is non-trivial due to several challenges, including task conflicts, inconsistent magnitudes, and the requirement of synchronous optimization. As countermeasures, we 1) introduce auxiliary tasks with quadruplet loss functions to capture cross-task fine-grained ranking information and avoid task conflicts, 2) design a calibrated distillation approach to align and distill knowledge from auxiliary tasks, and 3) propose a novel error correction mechanism to enable and facilitate synchronous training of teacher and student models. Comprehensive experiments are conducted to verify the effectiveness of our framework on real-world datasets.
Submitted 27 March, 2022; v1 submitted 20 February, 2022;
originally announced February 2022.
-
Light-induced dimension crossover in 1T-TiSe$_2$ dictated by excitonic correlations
Authors:
Yun Cheng,
Alfred Zong,
Jun Li,
Wei Xia,
Shaofeng Duan,
Wenxuan Zhao,
Yidian Li,
Fengfeng Qi,
Jun Wu,
Lingrong Zhao,
Pengfei Zhu,
Xiao Zou,
Tao Jiang,
Yanfeng Guo,
Lexian Yang,
Dong Qian,
Wentao Zhang,
Anshul Kogar,
Michael W. Zuerch,
Dao Xiang,
Jie Zhang
Abstract:
In low-dimensional systems with strong electronic correlations, the application of an ultrashort laser pulse often yields novel phases that are otherwise inaccessible. The central challenge in understanding such phenomena is to determine how dimensionality and many-body correlations together govern the pathway of a non-adiabatic transition. To this end, we examine a layered compound, 1T-TiSe$_2$, whose three-dimensional charge-density-wave (3D CDW) state also features exciton condensation due to strong electron-hole interactions. We find that photoexcitation suppresses the equilibrium 3D CDW while creating a nonequilibrium 2D CDW. Remarkably, the dimension reduction does not occur unless bound electron-hole pairs are broken. This relation suggests that excitonic correlations maintain the out-of-plane CDW coherence, settling a long-standing debate over their role in the CDW transition. Our findings demonstrate how optical manipulation of electronic interaction enables one to control the dimensionality of a broken-symmetry order, paving the way for realizing other emergent states in strongly correlated systems.
Submitted 19 February, 2022;
originally announced February 2022.
-
A Comprehensive Survey with Quantitative Comparison of Image Analysis Methods for Microorganism Biovolume Measurements
Authors:
Jiawei Zhang,
Chen Li,
Md Mamunur Rahaman,
Yudong Yao,
Pingli Ma,
Jinghua Zhang,
Xin Zhao,
Tao Jiang,
Marcin Grzegorzek
Abstract:
With accelerating urbanization and rising living standards, microorganisms play increasingly important roles in industrial production, biotechnology, and food safety testing. Microorganism biovolume measurement is one of the essential parts of microbial analysis. However, traditional manual measurement methods are time-consuming and make it challenging to measure the characteristics precisely. With the development of digital image processing techniques, the characteristics of the microbial population can be detected and quantified. Changing trends can then be addressed in time, providing a basis for improvement. Applications of microorganism biovolume measurement methods have developed since the 1980s. More than 62 articles are reviewed in this study, grouped by digital image segmentation method and period. This study has high research significance and application value, and can serve as a reference for microbial researchers seeking a comprehensive understanding of microorganism biovolume measurements using digital image analysis methods and their potential applications.
Submitted 2 May, 2022; v1 submitted 17 February, 2022;
originally announced February 2022.
-
A Survey of Semen Quality Evaluation in Microscopic Videos Using Computer Assisted Sperm Analysis
Authors:
Wenwei Zhao,
Pingli Ma,
Chen Li,
Xiaoning Bu,
Shuojia Zou,
Tao Jiang,
Marcin Grzegorzek
Abstract:
Computer Assisted Sperm Analysis (CASA) plays a crucial role in male reproductive health diagnosis and infertility treatment. With the development of the computer industry in recent years, a great number of accurate algorithms have been proposed. With the assistance of these novel algorithms, CASA can achieve faster and higher-quality results. Since image processing, including pre-processing, feature extraction, target detection and tracking, is the technical basis of CASA, these methods are important technical steps in CASA. The various works related to CASA methods in the last 30 years (since 1988) are comprehensively introduced and analysed in this survey. To facilitate understanding, the methods involved are analysed in the sequence of general steps in sperm analysis. In other words, the methods related to sperm detection (localization) are analysed first, followed by the methods of sperm tracking. Besides this, we analyse the present situation of CASA and discuss its future prospects. Based on our work, the feasibility of applying the methods mentioned in this review to sperm microscopic videos is explained. Moreover, this survey may inspire solutions to existing challenges of object detection and tracking in microscope videos.
Submitted 17 February, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Towards Best Practice of Interpreting Deep Learning Models for EEG-based Brain Computer Interfaces
Authors:
Jian Cui,
Liqiang Yuan,
Zhaoxiang Wang,
Ruilin Li,
Tianzi Jiang
Abstract:
As deep learning has achieved state-of-the-art performance for many tasks of EEG-based BCI, many efforts have been made in recent years to understand what has been learned by the models. This is commonly done by generating a heatmap indicating to what extent each pixel of the input contributes to the final classification for a trained model. Despite their wide use, it is not yet understood to what extent the obtained interpretation results can be trusted and how accurately they reflect the model decisions. To fill this research gap, we conduct a study to evaluate different deep interpretation techniques quantitatively on EEG datasets. The results reveal the importance of selecting a proper interpretation technique as the initial step. In addition, we find that the quality of the interpretation results is inconsistent for individual samples even when a method with overall good performance is used. Many factors, including model structure and dataset types, could potentially affect the quality of the interpretation results. Based on these observations, we propose a set of procedures that allow the interpretation results to be presented in an understandable and trusted way. We illustrate the usefulness of our method for EEG-based BCI with instances selected from different scenarios.
Submitted 17 April, 2023; v1 submitted 12 February, 2022;
originally announced February 2022.
-
Deep Monte Carlo Quantile Regression for Quantifying Aleatoric Uncertainty in Physics-informed Temperature Field Reconstruction
Authors:
Xiaohu Zheng,
Wen Yao,
Zhiqiang Gong,
Yunyang Zhang,
Xiaoyu Zhao,
Tingsong Jiang
Abstract:
For temperature field reconstruction (TFR), a complex image-to-image regression problem, the convolutional neural network (CNN) is a powerful surrogate model due to the convolutional layer's good image feature extraction ability. However, a large amount of labeled data is needed to train a CNN, and a common CNN cannot quantify the aleatoric uncertainty caused by data noise. In actual engineering, noiseless and labeled training data are hard to obtain for TFR. To solve these two problems, this paper proposes a deep Monte Carlo quantile regression (Deep MC-QR) method for reconstructing the temperature field and quantifying aleatoric uncertainty caused by data noise. On the one hand, the Deep MC-QR method uses physical knowledge to guide the training of the CNN. Thereby, the Deep MC-QR method can construct an accurate TFR surrogate model without any labeled training data. On the other hand, the Deep MC-QR method constructs a quantile level image for each input in each training epoch. Then, the trained CNN model can quantify aleatoric uncertainty by quantile level image sampling during the prediction stage. Finally, the effectiveness of the proposed Deep MC-QR method is validated by many experiments, and the influence of data noise on TFR is analyzed.
Submitted 14 February, 2022;
originally announced February 2022.
-
A State-of-the-art Survey of U-Net in Microscopic Image Analysis: from Simple Usage to Structure Mortification
Authors:
Jian Wu,
Wanli Liu,
Chen Li,
Tao Jiang,
Islam Mohammad Shariful,
Hongzan Sun,
Xiaoqi Li,
Xintong Li,
Xinyu Huang,
Marcin Grzegorzek
Abstract:
Image analysis technology is used to overcome the shortcomings of traditional manual methods in disease, wastewater treatment, and environmental change monitoring analysis, and convolutional neural networks (CNNs) play an important role in microscopic image analysis. An important step in detection, tracking, monitoring, feature extraction, modeling and analysis is image segmentation, in which U-Net has been increasingly applied to microscopic images. This paper comprehensively reviews the development history of U-Net, analyzes the research results of various segmentation methods since the emergence of U-Net, and conducts a comprehensive review of related papers. First, this paper summarizes the improved methods of U-Net and then lists the significance of existing image segmentation techniques and the improvements that have been introduced over the years. Finally, focusing on the different improvement strategies of U-Net in different papers, the related work of each application target is reviewed according to detailed technical categories to facilitate future research. Researchers can clearly see how the technology has developed and been passed on, and keep up with future trends in this interdisciplinary field.
Submitted 23 April, 2022; v1 submitted 13 February, 2022;
originally announced February 2022.
-
Nonreciprocal transport in a bilayer of MnBi2Te4 and Pt
Authors:
Chen Ye,
Xiangnan Xie,
Wenxing Lv,
Ke Huang,
Allen Jian Yang,
Sicong Jiang,
Xue Liu,
Dapeng Zhu,
Xuepeng Qiu,
Mingyu Tong,
Tong Zhou,
Chuang-Han Hsu,
Guoqing Chang,
Hsin Lin,
Peisen Li,
Kesong Yang,
Zhenyu Wang,
Tian Jiang,
Xiao Renshaw Wang
Abstract:
MnBi2Te4 (MBT) is the first intrinsic magnetic topological insulator with the interaction of spin-momentum locked surface electrons and intrinsic magnetism, and it exhibits novel magnetic and topological phenomena. Recent studies suggested that the interaction of electrons and magnetism can be affected by the Mn-doped Bi2Te3 phase at the surface due to inevitable structural defects. Here we report an observation of nonreciprocal transport, i.e. current-direction-dependent resistance, in a bilayer composed of antiferromagnetic MBT and nonmagnetic Pt. The emergence of the nonreciprocal response below the Néel temperature confirms a correlation between nonreciprocity and intrinsic magnetism in the surface state of MBT. The angular dependence of the nonreciprocal transport indicates that the nonreciprocal response originates from the asymmetric scattering of electrons at the surface of MBT mediated by magnons. Our work provides insight into nonreciprocity arising from the correlation between magnetism and Dirac surface electrons in intrinsic magnetic topological insulators.
Submitted 9 February, 2022;
originally announced February 2022.
-
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Authors:
Stephen H. Bach,
Victor Sanh,
Zheng-Xin Yong,
Albert Webson,
Colin Raffel,
Nihal V. Nayak,
Abheesht Sharma,
Taewoon Kim,
M Saiful Bari,
Thibault Fevry,
Zaid Alyafeai,
Manan Dey,
Andrea Santilli,
Zhiqing Sun,
Srulik Ben-David,
Canwen Xu,
Gunjan Chhablani,
Han Wang,
Jason Alan Fries,
Maged S. Al-shaibani,
Shanya Sharma,
Urmish Thakker,
Khalid Almubarak,
Xiangru Tang,
Dragomir Radev
, et al. (2 additional authors not shown)
Abstract:
PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a dataset to a natural language input and target output. Using prompts to train and query language models is an emerging area in NLP that requires new tools that let users develop and refine these prompts collaboratively. PromptSource addresses the emergent challenges in this new setting with (1) a templating language for defining data-linked prompts, (2) an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and (3) a community-driven set of guidelines for contributing new prompts to a common pool. Over 2,000 prompts for roughly 170 datasets are already available in PromptSource. PromptSource is available at https://github.com/bigscience-workshop/promptsource.
Submitted 29 March, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Tree-degenerate graphs and nested dependent random choice
Authors:
Tao Jiang,
Sean Longbrake
Abstract:
The celebrated dependent random choice lemma states that in a bipartite graph an average vertex (weighted by its degree) has the property that almost all small subsets $S$ in its neighborhood have a common neighborhood almost as large as in the random graph of the same edge-density. Two well-known applications of the lemma are as follows. The first is a theorem of Füredi and of Alon, Krivelevich, and Sudakov showing that the maximum number of edges in an $n$-vertex graph not containing a fixed bipartite graph with maximum degree at most $r$ on one side is $O(n^{2-1/r})$. This was recently extended by Grzesik, Janzer and Nagy to the family of so-called $(r,t)$-blowups of a tree. A second application is a theorem of Conlon, Fox, and Sudakov, confirming a special case of a conjecture of Erdős and Simonovits and of Sidorenko, showing that if $H$ is a bipartite graph that contains a vertex complete to the other part and $G$ is a graph then the probability that the uniform random mapping from $V(H)$ to $V(G)$ is a homomorphism is at least $\left[\frac{2|E(G)|}{|V(G)|^2}\right]^{|E(H)|}$.
In this note, we introduce a nested variant of the dependent random choice lemma, which might be of independent interest. We then apply it to obtain a common extension of the theorem of Conlon, Fox, and Sudakov and the theorem of Grzesik, Janzer, and Nagy, regarding Turán and Sidorenko properties of so-called tree-degenerate graphs.
Submitted 25 January, 2022;
originally announced January 2022.
-
Forgery Attack Detection in Surveillance Video Streams Using Wi-Fi Channel State Information
Authors:
Yong Huang,
Xiang Li,
Wei Wang,
Tao Jiang,
Qian Zhang
Abstract:
Cybersecurity breaches expose surveillance video streams to forgery attacks, under which authentic streams are falsified to hide unauthorized activities. Traditional video forensics approaches can localize forgery traces using spatial-temporal analysis on relatively long video clips, but fall short in real-time forgery detection. Recent work correlates time-series camera and wireless signals to detect looped videos but cannot realize fine-grained forgery localization. To overcome these limitations, we propose Secure-Pose, which exploits the pervasive coexistence of surveillance and Wi-Fi infrastructures to defend against video forgery attacks in a real-time and fine-grained manner. We observe that coexisting camera and Wi-Fi signals convey common human semantic information and that forgery attacks on video streams will decouple such information correspondence. In particular, retrievable human pose features are first extracted from concurrent video and Wi-Fi channel state information (CSI) streams. Then, a lightweight detection network is developed to accurately discover forgery attacks and an efficient localization algorithm is devised to seamlessly track forgery traces in video streams. We implement Secure-Pose using one Logitech camera and two Intel 5300 NICs and evaluate it in different environments. Secure-Pose achieves a high detection accuracy of 98.7% and localizes abnormal objects under playback and tampering attacks.
Submitted 24 January, 2022;
originally announced January 2022.
-
Contrastive and Selective Hidden Embeddings for Medical Image Segmentation
Authors:
Zhuowei Li,
Zihao Liu,
Zhiqiang Hu,
Qing Xia,
Ruiqin Xiong,
Shaoting Zhang,
Dimitris Metaxas,
Tingting Jiang
Abstract:
Medical image segmentation has been widely recognized as a pivotal procedure for clinical diagnosis, analysis, and treatment planning. However, the laborious and expensive annotation process slows further advances. Contrastive learning-based weight pre-training provides an alternative by leveraging unlabeled data to learn a good representation. In this paper, we investigate how contrastive learning benefits general supervised medical segmentation tasks. To this end, patch-dragsaw contrastive regularization (PDCR) is proposed to perform patch-level tugging and repulsing with the extent controlled by a continuous affinity score. In addition, a new structure dubbed the uncertainty-aware feature selection block (UAFS) is designed to perform the feature selection process, which can handle the learning target shift caused by minority features with high uncertainty. By plugging the two proposed modules into an existing segmentation architecture, we achieve state-of-the-art results across 8 public datasets from 6 domains. The newly designed modules further reduce the amount of training data to a quarter while achieving comparable, if not better, performance. From this perspective, we take the opposite direction of the original self/un-supervised contrastive learning by further excavating information contained within the label.
Submitted 29 April, 2022; v1 submitted 21 January, 2022;
originally announced January 2022.
-
PromptBERT: Improving BERT Sentence Embeddings with Prompts
Authors:
Ting Jiang,
Jian Jiao,
Shaohan Huang,
Zihan Zhang,
Deqing Wang,
Fuzhen Zhuang,
Furu Wei,
Haizhen Huang,
Denvy Deng,
Qi Zhang
Abstract:
We propose PromptBERT, a novel contrastive learning method for learning better sentence representations. We first analyze the drawbacks of current sentence embeddings from the original BERT and find that they are mainly due to static token embedding bias and ineffective BERT layers. We then propose the first prompt-based sentence embedding method and discuss two prompt representing methods and three prompt searching methods to make BERT achieve better sentence embeddings. Moreover, we propose a novel unsupervised training objective based on template denoising, which substantially narrows the performance gap between the supervised and unsupervised settings. Extensive experiments show the effectiveness of our method. Compared to SimCSE, PromptBERT achieves 2.29 and 2.58 points of improvement based on BERT and RoBERTa in the unsupervised setting.
Submitted 13 October, 2022; v1 submitted 12 January, 2022;
originally announced January 2022.
-
gDNA: Towards Generative Detailed Neural Avatars
Authors:
Xu Chen,
Tianjian Jiang,
Jie Song,
Jinlong Yang,
Michael J. Black,
Andreas Geiger,
Otmar Hilliges
Abstract:
To make 3D human avatars widely available, we must be able to generate a variety of 3D virtual humans with varied identities and shapes in arbitrary poses. This task is challenging due to the diversity of clothed body shapes, their complex articulations, and the resulting rich, yet stochastic geometric detail in clothing. Hence, current methods to represent 3D people do not provide a full generative model of people in clothing. In this paper, we propose a novel method that learns to generate detailed 3D shapes of people in a variety of garments with corresponding skinning weights. Specifically, we devise a multi-subject forward skinning module that is learned from only a few posed, un-rigged scans per subject. To capture the stochastic nature of high-frequency details in garments, we leverage an adversarial loss formulation that encourages the model to capture the underlying statistics. We provide empirical evidence that this leads to realistic generation of local details such as wrinkles. We show that our model is able to generate natural human avatars wearing diverse and detailed clothing. Furthermore, we show that our method can be used on the task of fitting human models to raw scans, outperforming the previous state-of-the-art.
Submitted 13 April, 2022; v1 submitted 11 January, 2022;
originally announced January 2022.
-
Interference Nulling Using Reconfigurable Intelligent Surface
Authors:
Tao Jiang,
Wei Yu
Abstract:
This paper investigates the interference nulling capability of reconfigurable intelligent surface (RIS) in a multiuser environment where multiple single-antenna transceivers communicate simultaneously in a shared spectrum. From a theoretical perspective, we show that when the channels between the RIS and the transceivers have line-of-sight and the direct paths are blocked, it is possible to adjust the phases of the RIS elements to null out all the interference completely and to achieve the maximum $K$ degrees-of-freedom (DoF) in the overall $K$-user interference channel, provided that the number of RIS elements exceeds some finite value that depends on $K$. Algorithmically, for any fixed channel realization we formulate the interference nulling problem as a feasibility problem, and propose an alternating projection algorithm to efficiently solve the resulting nonconvex problem with local convergence guarantee. Numerical results show that the proposed alternating projection algorithm can null all the interference if the number of RIS elements is only slightly larger than a threshold of $2K(K-1)$. For the practical sum-rate maximization objective, this paper proposes to use the zero-forcing solution obtained from alternating projection as an initial point for subsequent Riemannian conjugate gradient optimization and shows that it has a significant performance advantage over random initializations. For the objective of maximizing the minimum rate, this paper proposes a subgradient projection method which is capable of achieving excellent performance at low complexity.
Submitted 27 January, 2022; v1 submitted 25 December, 2021;
originally announced December 2021.
-
Measurement of $e^{+}e^{-}\toφπ^{+}π^{-}$ cross sections at center-of-mass energies from 2.00 to 3.08 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (589 additional authors not shown)
Abstract:
Using data corresponding to an integrated luminosity of $651~\mathrm{pb}^{-1}$ accumulated at 22 center-of-mass energies from 2.00 to 3.08 GeV by the BESIII experiment, the process $e^{+}e^{-}\toφπ^{+}π^{-}$ is studied. The cross sections for $e^{+}e^{-}\toφπ^{+}π^{-}$ are consistent with previous results, but with improved precision. To measure the mass and width of the structure observed in the cross section line shape, a combined fit is performed after enhancing the contribution from $φf_{0}(980)$. The fit reveals a structure with a mass of $M=2178\pm20\pm5~{\rm MeV}/c^2$ and a width of $\varGamma=140\pm36\pm16~{\rm MeV}$, where the first uncertainties are statistical and the second ones are systematic.
Submitted 17 August, 2023; v1 submitted 25 December, 2021;
originally announced December 2021.
-
Characteristic polynomials and finitely dimensional representations of $\mathfrak{sl}(2, \mathbb{C})$
Authors:
Tianyi Jiang,
Shoumin Liu
Abstract:
In this paper, we obtain a general formula for the characteristic polynomial of a finite-dimensional representation of the Lie algebra $\mathfrak{sl}(2, \mathbb{C})$ and the form of these characteristic polynomials, and prove that there is a one-to-one correspondence between representations and their characteristic polynomials. We define a product on these characteristic polynomials, endowing them with a monoid structure.
Submitted 15 December, 2021;
originally announced December 2021.
-
EMDS-6: Environmental Microorganism Image Dataset Sixth Version for Image Denoising, Segmentation, Feature Extraction, Classification and Detection Methods Evaluation
Authors:
Peng Zhao,
Chen Li,
Md Mamunur Rahaman,
Hao Xu,
Pingli Ma,
Hechen Yang,
Hongzan Sun,
Tao Jiang,
Ning Xu,
Marcin Grzegorzek
Abstract:
Environmental microorganisms (EMs) are ubiquitous around us and have an important impact on the survival and development of human society. However, the high standards and strict requirements for the preparation of environmental microorganism (EM) data have led to a shortage of related databases, not to mention databases with ground truth (GT) images. This problem seriously affects the progress of related experiments. Therefore, this study develops the Environmental Microorganism Dataset Sixth Version (EMDS-6), which contains 21 types of EMs. Each type of EM contains 40 original and 40 GT images, for a total of 1680 EM images. To test the effectiveness of EMDS-6, we choose classic image processing algorithms for tasks such as image denoising, image segmentation and object detection. The experimental results show that EMDS-6 can be used to evaluate the performance of image denoising, image segmentation, image feature extraction, image classification, and object detection methods.
Submitted 25 April, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Non-Asymptotic Analysis of Online Multiplicative Stochastic Gradient Descent
Authors:
Riddhiman Bhattacharya,
Tiefeng Jiang
Abstract:
Past research has indicated that the covariance of the Stochastic Gradient Descent (SGD) error done via minibatching plays a critical role in determining its regularization and escape from low potential points. Motivated by some new research in this area, we prove universality results by showing that noise classes that have the same mean and covariance structure as SGD via minibatching have similar properties. We mainly consider the Multiplicative Stochastic Gradient Descent (M-SGD) algorithm as introduced in previous work, which has a much more general noise class than the SGD algorithm done via minibatching. We establish non-asymptotic bounds for the M-SGD algorithm in the Wasserstein distance. We also show that the M-SGD error is approximately a scaled Gaussian distribution with mean $0$ at any fixed point of the M-SGD algorithm.
Submitted 1 March, 2023; v1 submitted 13 December, 2021;
originally announced December 2021.
-
Active Sensing for Communications by Learning
Authors:
Foad Sohrabi,
Tao Jiang,
Wei Cui,
Wei Yu
Abstract:
This paper proposes a deep learning approach to a class of active sensing problems in wireless communications in which an agent sequentially interacts with an environment over a predetermined number of time frames to gather information in order to perform a sensing or actuation task for maximizing some utility function. In such an active learning setting, the agent needs to design an adaptive sensing strategy sequentially based on the observations made so far. To tackle such a challenging problem in which the dimension of historical observations increases over time, we propose to use a long short-term memory (LSTM) network to exploit the temporal correlations in the sequence of observations and to map each observation to a fixed-size state information vector. We then use a deep neural network (DNN) to map the LSTM state at each time frame to the design of the next measurement step. Finally, we employ another DNN to map the final LSTM state to the desired solution. We investigate the performance of the proposed framework for adaptive channel sensing problems in wireless communications. In particular, we consider the adaptive beamforming problem for mmWave beam alignment and the adaptive reconfigurable intelligent surface sensing problem for reflection alignment. Numerical results demonstrate that the proposed deep active sensing strategy outperforms the existing adaptive or nonadaptive sensing schemes.
Submitted 8 February, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Disordered transmission-line networks with and without parity symmetry
Authors:
Tianshu Jiang,
C. T. Chan
Abstract:
Topological states are useful because they are robust against disorder and imperfection. In this study, we consider the effect of disorder and the breaking of parity symmetry on a topological network system in which the edge states are protected by Chern numbers. In the absence of periodicity, the local Chern number is adopted to characterize the topological features of the network. Our numerical results show that the local Chern number and the edge states are very robust against onsite disorder as long as the gap of the bulk state continuum remains open and survives even when the bulk band gap is closed. Breaking the parity symmetry can destroy the quantization of local Chern numbers, compromising the existence of edge modes. We observed non-integer local Chern number peaks that are non-zero inside the bulk bands but these non-zero non-integral local Chern numbers are not associated with the existence of robust edge states.
Submitted 27 November, 2021;
originally announced November 2021.
-
Intermediate Mass-Ratio Inspirals with Dark Matter Minispikes
Authors:
Ning Dai,
Yungui Gong,
Tong Jiang,
Dicong Liang
Abstract:
The dark matter (DM) distributed around an intermediate-mass black hole (IMBH) forms an overdensity region called a DM minispike. We consider a binary system that consists of an IMBH with a DM minispike and a small black hole inspiralling around the IMBH in an eccentric orbit. The factors that affect the evolution of the orbit include the gravity of the system, the dynamical friction and accretion of the small black hole caused by the DM minispike, and the radiation reaction of gravitational waves (GWs). Using the method of osculating orbits, we find that when the semilatus rectum $p \ll 10^5 R_s$ (where $R_s$ is the Schwarzschild radius of the IMBH), the dominant factors are the dynamical friction and accretion from the DM minispike and the radiation reaction. When $p \gg 10^5 R_s$, the gravity from the DM minispike dominates the orbital evolution. The existence of the DM minispike leads to deviations from the Keplerian orbit, such as extra orbital precession and hence an extra phase shift in the GW waveform. By calculating the signal-to-noise ratio for GWs with and without DM minispikes and the mismatch between them, we show that the effect of the DM minispike in GW waveforms can potentially be detected by future space-based GW detectors such as LISA, Taiji, and Tianqin.
Submitted 21 August, 2022; v1 submitted 26 November, 2021;
originally announced November 2021.
-
Error mitigation in variational quantum eigensolvers using tailored probabilistic machine learning
Authors:
Tao Jiang,
John Rogers,
Marius S. Frank,
Ove Christiansen,
Yong-Xin Yao,
Nicola Lanatà
Abstract:
Quantum computing technology has the potential to revolutionize the simulation of materials and molecules in the near future. A primary challenge in achieving near-term quantum advantage is effectively mitigating the noise effects inherent in current quantum processing units (QPUs). This challenge is also decisive in the context of quantum-classical hybrid schemes employing variational quantum eigensolvers (VQEs) that have attracted significant interest in recent years. In this work, we present a novel method that employs parametric Gaussian process regression (GPR) within an active learning framework to mitigate noise in quantum computations, focusing on VQEs. Our approach, grounded in probabilistic machine learning, exploits a custom prior based on the VQE ansatz to capture the underlying correlations between VQE outputs for different variational parameters, thereby enhancing both accuracy and efficiency. We demonstrate the effectiveness of our method on a 2-site Anderson impurity model and an 8-site Heisenberg model, using the IBM open-source quantum computing framework, Qiskit, showcasing substantial improvements in the accuracy of VQE outputs while reducing the number of direct QPU energy evaluations. This work contributes to the ongoing efforts in quantum error mitigation and optimization, bringing us a step closer to realizing the potential of quantum computing in quantum matter simulations.
Submitted 13 January, 2024; v1 submitted 16 November, 2021;
originally announced November 2021.
-
Lattice points in stretched finite type domains
Authors:
J. Guo,
T. Jiang
Abstract:
We study an optimal stretching problem, which is a variant lattice point problem, for convex domains in $\mathbb{R}^d$ ($d\geq 2$) with smooth boundary of finite type that are symmetric with respect to each coordinate hyperplane/axis. We prove that optimal domains which contain the most positive (or least nonnegative) lattice points are asymptotically balanced.
Submitted 11 November, 2021;
originally announced November 2021.
-
Deciphering the Language of Nature: A transformer-based language model for deleterious mutations in proteins
Authors:
Theodore Jiang,
Li Fang,
Kai Wang
Abstract:
Various machine-learning models, including deep neural network models, have already been developed to predict the deleteriousness of missense (non-synonymous) mutations. Potential improvements to the current state of the art, however, may still benefit from a fresh look at the biological problem using more sophisticated self-adaptive machine-learning approaches. Recent advances in the natural language processing field show transformer models, a type of deep neural network, to be particularly powerful at modeling sequence information with context dependence. In this study, we introduce MutFormer, a transformer-based model for the prediction of deleterious missense mutations, which uses reference and mutated protein sequences from the human genome as the primary features. MutFormer takes advantage of a combination of self-attention layers and convolutional layers to learn both long-range and short-range dependencies between amino acid mutations in a protein sequence. We first pre-trained MutFormer on reference protein sequences and mutated protein sequences resulting from common genetic variants observed in human populations. We next examined different fine-tuning methods to successfully apply the model to deleteriousness prediction of missense mutations. Finally, we evaluated MutFormer's performance on multiple testing data sets. We found that MutFormer showed similar or improved performance over a variety of existing tools, including those that used conventional machine-learning approaches. We conclude that MutFormer successfully considers sequence features that are not explored in previous studies and could potentially complement existing computational predictions or empirically generated functional scores to improve our understanding of disease variants.
Submitted 9 February, 2023; v1 submitted 27 October, 2021;
originally announced October 2021.
-
Improving Non-autoregressive Generation with Mixup Training
Authors:
Ting Jiang,
Shaohan Huang,
Zihan Zhang,
Deqing Wang,
Fuzhen Zhuang,
Furu Wei,
Haizhen Huang,
Liangjie Zhang,
Qi Zhang
Abstract:
While pre-trained language models have achieved great success on various natural language understanding tasks, how to effectively leverage them for non-autoregressive generation tasks remains a challenge. To solve this problem, we present a non-autoregressive generation model based on pre-trained transformer models. To bridge the gap between autoregressive and non-autoregressive models, we propose a simple and effective iterative training method called MIx Source and pseudo Target (MIST). Unlike other iterative decoding methods, which sacrifice the inference speed to achieve better performance based on multiple decoding iterations, MIST works in the training stage and has no effect on inference time. Our experiments on three generation benchmarks, including question generation, summarization and paraphrase generation, show that the proposed framework achieves new state-of-the-art results for fully non-autoregressive models. We also demonstrate that our method can be applied to a variety of pre-trained models. For instance, MIST based on the small pre-trained model also obtains performance comparable to seq2seq models.
Submitted 21 October, 2021;
originally announced October 2021.
-
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Authors:
Xiaobin Zhuang,
Huiran Yu,
Weifeng Zhao,
Tao Jiang,
Peng Hu
Abstract:
An automatic pitch correction system typically includes several stages, such as pitch extraction, deviation estimation, pitch shift processing, and cross-fade smoothing. However, designing these components with strategies often requires domain expertise and they are likely to fail on corner cases. In this paper, we present KaraTuner, an end-to-end neural architecture that predicts the pitch curve and resynthesizes the singing voice directly from the tuned pitch and vocal spectrum extracted from the original recordings. Several vital technical points have been introduced in KaraTuner to ensure pitch accuracy, pitch naturalness, timbre consistency, and sound quality. A feed-forward Transformer is employed in the pitch predictor to capture long-term dependencies in the vocal spectrum and musical note. We also develop a pitch-controllable vocoder based on a novel source-filter block and the Fre-GAN architecture. KaraTuner obtains a higher preference than the rule-based pitch correction approach through A/B tests, and perceptual experiments show that the proposed vocoder achieves significant advantages in timbre consistency and sound quality compared with the parametric WORLD vocoder, phase vocoder and CLPC vocoder.
Submitted 26 June, 2022; v1 submitted 18 October, 2021;
originally announced October 2021.
-
ASFormer: Transformer for Action Segmentation
Authors:
Fangqiu Yi,
Hongyu Wen,
Tingting Jiang
Abstract:
Algorithms for the action segmentation task typically use temporal models to predict what action is occurring at each frame for a minute-long daily activity. Recent studies have shown the potential of Transformer in modeling the relations among elements in sequential data. However, there are several major concerns when directly applying the Transformer to the action segmentation task, such as the lack of inductive biases with small training sets, the deficit in processing long input sequences, and the limitation of the decoder architecture to utilize temporal relations among multiple action segments to refine the initial predictions. To address these concerns, we design an efficient Transformer-based model for the action segmentation task, named ASFormer, with three distinctive characteristics: (i) We explicitly bring in the local connectivity inductive priors because of the high locality of features. It constrains the hypothesis space within a reliable scope, and is beneficial for the action segmentation task to learn a proper target function with small training sets. (ii) We apply a pre-defined hierarchical representation pattern that efficiently handles long input sequences. (iii) We carefully design the decoder to refine the initial predictions from the encoder. Extensive experiments on three public datasets demonstrate the effectiveness of our methods. Code is available at https://github.com/ChinaYi/ASFormer.
Submitted 16 October, 2021;
originally announced October 2021.
-
Multitask Prompted Training Enables Zero-Shot Task Generalization
Authors:
Victor Sanh,
Albert Webson,
Colin Raffel,
Stephen H. Bach,
Lintang Sutawika,
Zaid Alyafeai,
Antoine Chaffin,
Arnaud Stiegler,
Teven Le Scao,
Arun Raja,
Manan Dey,
M Saiful Bari,
Canwen Xu,
Urmish Thakker,
Shanya Sharma Sharma,
Eliza Szczechla,
Taewoon Kim,
Gunjan Chhablani,
Nihal Nayak,
Debajyoti Datta,
Jonathan Chang,
Mike Tian-Jian Jiang,
Han Wang,
Matteo Manica,
Sheng Shen
, et al. (16 additional authors not shown)
Abstract:
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models' pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts with diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely held-out tasks. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6x its size. All trained models are available at https://github.com/bigscience-workshop/t-zero and all prompts are available at https://github.com/bigscience-workshop/promptsource.
Submitted 17 March, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
EMDS-7: Environmental Microorganism Image Dataset Seventh Version for Multiple Object Detection Evaluation
Authors:
Hechen Yang,
Chen Li,
Xin Zhao,
Bencheng Cai,
Jiawei Zhang,
**li Ma,
Peng Zhao,
Ao Chen,
Tao Jiang,
Hongzan Sun,
Yueyang Teng,
Shouliang Qi,
Tao Jiang,
Marcin Grzegorzek
Abstract:
The Environmental Microorganism Image Dataset Seventh Version (EMDS-7) is a microscopic image data set, including the original Environmental Microorganism images (EMs) and the corresponding object labeling files in ".XML" format file. The EMDS-7 data set consists of 41 types of EMs, which has a total of 2365 images and 13216 labeled objects. The EMDS-7 database mainly focuses on the object detection. In order to prove the effectiveness of EMDS-7, we select the most commonly used deep learning methods (Faster-RCNN, YOLOv3, YOLOv4, SSD and RetinaNet) and evaluation indices for testing and evaluation. EMDS-7 is freely published for non-commercial purpose at: https://figshare.com/articles/dataset/EMDS-7_DataSet/16869571
Submitted 28 October, 2021; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Scattering-coded elastic meta-boundary
Authors:
Tianxi Jiang,
Xinxin Liao,
Hao Huang,
Zhi-Ke Peng,
Qingbo He
Abstract:
Object localization through active elastic waves is a crucial technology, but generally requires a transducer array with complex hardware. Although computational sensing has been demonstrated to be able to overcome the shortcomings of transducer arrays by merging artificially designed structures into the sensing process, coding spatial elastic waves for active object identification is still a knowledge gap. Here we propose a scattering-coded elastic meta-boundary composed of randomly distributed scatterers for computational identification of objects with a single transducer. The multiple scattering effect of the meta-boundary introduces complexity into scattered fields to achieve a highly uncorrelated scattering coding of elastic waves, thereby eliminating the ambiguity of the object location information. We demonstrate that the locations of objects can be uniquely identified by using the scattering coding of our designed meta-boundary, delivering a design of meta-boundary touchscreen for human-machine interaction. The proposed scattering-coded meta-boundary opens up avenues for artificially designed boundaries with the capability of information coding and identification, and may provide important applications in wave sensing, such as structural monitoring, underwater detection, indoor localization, and biomedical imaging.
Submitted 30 September, 2021;
originally announced October 2021.
-
FCA: Learning a 3D Full-coverage Vehicle Camouflage for Multi-view Physical Adversarial Attack
Authors:
Donghua Wang,
Tingsong Jiang,
Jialiang Sun,
Weien Zhou,
Xiaoya Zhang,
Zhiqiang Gong,
Wen Yao,
Xiaoqian Chen
Abstract:
Physical adversarial attacks in object detection have attracted increasing attention. However, most previous works focus on hiding the objects from the detector by generating an individual adversarial patch, which only covers the planar part of the vehicle's surface and fails to attack the detector in physical scenarios for multi-view, long-distance and partially occluded objects. To bridge the gap between digital attacks and physical attacks, we exploit the full 3D vehicle surface to propose a robust Full-coverage Camouflage Attack (FCA) to fool detectors. Specifically, we first try rendering the nonplanar camouflage texture over the full vehicle surface. To mimic the real-world environment conditions, we then introduce a transformation function to transfer the rendered camouflaged vehicle into a photo realistic scenario. Finally, we design an efficient loss function to optimize the camouflage texture. Experiments show that the full-coverage camouflage attack can not only outperform state-of-the-art methods under various test cases but also generalize to different environments, vehicles, and object detectors. The code of FCA will be available at: https://idrl-lab.github.io/Full-coverage-camouflage-adversarial-attack/.
Submitted 11 December, 2021; v1 submitted 15 September, 2021;
originally announced September 2021.
-
Robustness and Generalization via Generative Adversarial Training
Authors:
Omid Poursaeed,
Tianxing Jiang,
Harry Yang,
Serge Belongie,
SerNam Lim
Abstract:
While deep neural networks have achieved remarkable success in various computer vision tasks, they often fail to generalize to new domains and subtle variations of input images. Several defenses have been proposed to improve the robustness against these variations. However, current defenses can only withstand the specific attack used in training, and the models often remain vulnerable to other input variations. Moreover, these methods often degrade performance of the model on clean images and do not generalize to out-of-domain samples. In this paper we present Generative Adversarial Training, an approach to simultaneously improve the model's generalization to the test set and out-of-domain samples as well as its robustness to unseen adversarial attacks. Instead of altering a low-level pre-defined aspect of images, we generate a spectrum of low-level, mid-level and high-level changes using generative models with a disentangled latent space. Adversarial training with these examples enables the model to withstand a wide range of attacks by observing a variety of input alterations during training. We show that our approach not only improves performance of the model on clean images and out-of-domain samples but also makes it robust against unforeseen attacks and outperforms prior work. We validate the effectiveness of our method by demonstrating results on various tasks such as classification, segmentation and object detection.
Submitted 6 September, 2021;
originally announced September 2021.
-
Knowledge Graph Enhanced Event Extraction in Financial Documents
Authors:
Kaihao Guo,
Tianpei Jiang,
Haipeng Zhang
Abstract:
Event extraction is a classic task in natural language processing with wide use in handling large amount of yet rapidly growing financial, legal, medical, and government documents which often contain multiple events with their elements scattered and mixed across the documents, making the problem much more difficult. Though the underlying relations between event elements to be extracted provide helpful contextual information, they are somehow overlooked in prior studies. We showcase the enhancement to this task brought by utilizing the knowledge graph that captures entity relations and their attributes. We propose a first event extraction framework that embeds a knowledge graph through a Graph Neural Network and integrates the embedding with regular features, all at document-level. Specifically, for extracting events from Chinese financial announcements, our method outperforms the state-of-the-art method by 5.3% in F1-score.
Submitted 6 September, 2021;
originally announced September 2021.
-
Bipartite-ness under smooth conditions
Authors:
Tao Jiang,
Sean Longbrake,
Jie Ma
Abstract:
Given a family $\mathcal{F}$ of bipartite graphs, the {\it Zarankiewicz number} $z(m,n,\mathcal{F})$ is the maximum number of edges in an $m$ by $n$ bipartite graph $G$ that does not contain any member of $\mathcal{F}$ as a subgraph (such $G$ is called {\it $\mathcal{F}$-free}). For $1\leq β<α<2$, a family $\mathcal{F}$ of bipartite graphs is $(α,β)$-{\it smooth} if for some $ρ>0$ and every $m\leq n$, $z(m,n,\mathcal{F})=ρm n^{α-1}+O(n^β)$. Motivated by their work on a conjecture of Erdős and Simonovits on compactness and a classic result of Andrásfai, Erdős and Sós, in \cite{AKSV} Allen, Keevash, Sudakov and Verstraëte proved that for any $(α,β)$-smooth family $\mathcal{F}$, there exists $k_0$ such that for all odd $k\geq k_0$ and sufficiently large $n$, any $n$-vertex $\mathcal{F}\cup\{C_k\}$-free graph with minimum degree at least $ρ(\frac{2n}{5}+o(n))^{α-1}$ is bipartite.
In this paper, we strengthen their result by showing that for every real $δ>0$, there exists $k_0$ such that for all odd $k\geq k_0$ and sufficiently large $n$, any $n$-vertex $\mathcal{F}\cup\{C_k\}$-free graph with minimum degree at least $δn^{α-1}$ is bipartite. Furthermore, our result holds under a more relaxed notion of smoothness, which includes the families $\mathcal{F}$ consisting of the single graph $K_{s,t}$ when $t\gg s$. We also prove an analogous result for $C_{2\ell}$-free graphs for every $\ell\geq 2$, which complements a result of Keevash, Sudakov and Verstraëte in \cite{KSV}.
Submitted 10 January, 2023; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Rainbow subdivisions of cliques
Authors:
Tao Jiang,
Shoham Letzter,
Abhishek Methuku,
Liana Yepremyan
Abstract:
We show that for every integer $m \ge 2$ and large $n$, every properly edge-coloured graph on $n$ vertices with at least $n (\log n)^{53}$ edges contains a rainbow subdivision of $K_m$. This is sharp up to a polylogarithmic factor. Our proof method exploits the connection between the mixing time of random walks and expansion in graphs.
Submitted 1 September, 2023; v1 submitted 19 August, 2021;
originally announced August 2021.
-
Mean Test with Fewer Observation than Dimension and Ratio Unbiased Estimator for Correlation Matrix
Authors:
Tiefeng Jiang,
** Li
Abstract:
Hotelling's T-squared test is a classical tool to test whether the mean of a multivariate normal distribution equals a specified value or whether the means of two multivariate normal distributions are equal. When the population dimension is higher than the sample size, the test is no longer applicable. Under this situation, in this paper we revisit the tests proposed by Srivastava and Du (2008), who revise Hotelling's statistics by replacing Wishart matrices with their diagonal matrices. They show the revised statistics are asymptotically normal. We use the random matrix theory to examine their statistics again and find that their discovery is just part of the big picture. In fact, we prove that their statistics, decided by the Euclidean norm of the population correlation matrix, can go to normal, mixing chi-squared distributions and a convolution of both. Examples are provided to show the phase transition phenomenon between the normal and mixing chi-squared distributions. The second contribution of ours is a rigorous derivation of an asymptotic ratio-unbiased-estimator of the squared Euclidean norm of the correlation matrix.
Submitted 16 August, 2021;
originally announced August 2021.
-
AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds
Authors:
Runsong Zhu,
Yuan Liu,
Zhen Dong,
Teng** Jiang,
Yuan Wang,
Wen** Wang,
Bisheng Yang
Abstract:
This paper presents a neural network for robust normal estimation on point clouds, named AdaFit, that can deal with point clouds with noise and density variations. Existing works use a network to learn point-wise weights for weighted least squares surface fitting to estimate the normals, which has difficulty in finding accurate normals in complex regions or regions containing noisy points. By analyzing the step of weighted least squares surface fitting, we find that it is hard to determine the polynomial order of the fitting surface and the fitting surface is sensitive to outliers. To address these problems, we propose a simple yet effective solution that adds an additional offset prediction to improve the quality of normal estimation. Furthermore, in order to take advantage of points from different neighborhood sizes, a novel Cascaded Scale Aggregation layer is proposed to help the network predict more accurate point-wise offsets and weights. Extensive experiments demonstrate that AdaFit achieves state-of-the-art performance on both the synthetic PCPNet dataset and the real-world SceneNN dataset.
Submitted 12 August, 2021;
originally announced August 2021.
-
Looking for the Signs: Identifying Isolated Sign Instances in Continuous Video Footage
Authors:
Tao Jiang,
Necati Cihan Camgoz,
Richard Bowden
Abstract:
In this paper, we focus on the task of one-shot sign spotting, i.e. given an example of an isolated sign (query), we want to identify whether/where this sign appears in a continuous, co-articulated sign language video (target). To achieve this goal, we propose a transformer-based network, called SignLookup. We employ 3D Convolutional Neural Networks (CNNs) to extract spatio-temporal representations from video clips. To solve the temporal scale discrepancies between the query and the target videos, we construct multiple queries from a single video clip using different frame-level strides. Self-attention is applied across these query clips to simulate a continuous scale space. We also utilize another self-attention module on the target video to learn the contextual information within the sequence. Finally, mutual attention is used to match the temporal scales to localize the query within the target sequence. Extensive experiments demonstrate that the proposed approach can not only reliably identify isolated signs in continuous videos, regardless of the signers' appearance, but can also generalize to different sign languages. By taking advantage of the attention mechanism and the adaptive features, our model achieves state-of-the-art performance on the sign spotting task with accuracy as high as 96% on challenging benchmark datasets, significantly outperforming other approaches.
Submitted 20 November, 2021; v1 submitted 21 July, 2021;
originally announced August 2021.
-
VLBI data processing on coronal radio-sounding experiments of Mars express
Authors:
Maoli Ma,
Guifré Molera Calvés,
Giuseppe Cimò,
Pei** Zhang,
Xiong Ming,
Peijia Li,
Pradyumna Kummamuru,
zhanghu Chu,
Tianyu Jiang,
Bo Xia,
Kondo Tetsuro,
Fengxian Tong,
Pablo de Vicente,
Jonathan Quick,
Hua Zhang,
Zhong Chen
Abstract:
The ESA's Mars Express solar corona experiments were performed at two solar conjunctions in the years 2015 and 2017 by a number of radio telescopes in the European VLBI Network. This paper presents the methods to measure the frequency and phase fluctuations of the spacecraft radio signal, and the applications to study the characteristics of the plasma turbulence effects on the signal at a single station and at multiple stations via cross-correlation. The power spectra of the frequency fluctuations observed between 4.9 and 76.3 $\rm R_{s}$ have a power-law shape close to a Kolmogorov spectrum over the frequency interval $ ν_{lo}< ν<ν_{up}$, where the nominal value of $ν_{lo}$ is set to 3 mHz and $ν_{up}$ is in the range of 0.03 $\sim$ 0.15 Hz. The RMS of the frequency fluctuations is presented as a function of the heliocentric distance. Furthermore, we analyse the variations of the electron column density fluctuations at solar offsets 4.9 $\rm{R_{s}}$ and 9.9 $\rm{R_{s}}$ and the cross-correlation products between the VLBI stations. The power density of the differential fluctuations between different stations decreases at $ν< 0.01$ Hz. Finally, the fast flow speeds of solar wind $>700$ $\rm{km~s^{-1}}$ are derived from the cross-correlation of frequency fluctuations at $ν< 0.01$ Hz. The fast flow speeds of solar wind correspond to the high heliolatitude of the coronal region that the radio rays passed. The VLBI observations and analysis methods can be used to study the electron column density fluctuations and the turbulence at multiple spatial points in the inner solar wind by providing multiple lines of sight between the Earth and the spacecraft.
Submitted 2 August, 2021;
originally announced August 2021.
-
LAConv: Local Adaptive Convolution for Image Fusion
Authors:
Zi-Rong **,
Liang-Jian Deng,
Tai-Xiang Jiang,
Tian-**g Zhang
Abstract:
The convolution operation is a powerful tool for feature extraction and plays a prominent role in the field of computer vision. However, when targeting the pixel-wise tasks like image fusion, it would not fully perceive the particularity of each pixel in the image if the uniform convolution kernel is used on different patches. In this paper, we propose a local adaptive convolution (LAConv), which is dynamically adjusted to different spatial locations. LAConv enables the network to pay attention to every specific local area in the learning process. Besides, the dynamic bias (DYB) is introduced to provide more possibilities for the depiction of features and make the network more flexible. We further design a residual structure network equipped with the proposed LAConv and DYB modules, and apply it to two image fusion tasks. Experiments for pansharpening and hyperspectral image super-resolution (HISR) demonstrate the superiority of our method over other state-of-the-art methods. It is worth mentioning that LAConv can also be competent for other super-resolution tasks with less computation effort.
Submitted 24 July, 2021;
originally announced July 2021.