Search | arXiv e-print repository

arXiv:2108.08378 [pdf, other]

Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility

Authors: Shuang Song, Zhaopeng Cui, Rongjun Qin

Abstract: We present a novel framework for mesh reconstruction from unstructured point clouds by taking advantage of the learned visibility of the 3D points in the virtual views and traditional graph-cut based mesh generation. Specifically, we first propose a three-step network that explicitly employs depth completion for visibility prediction. Then the visibility information of multiple views is aggregated to generate a 3D mesh model by solving an optimization problem considering visibility, in which a novel adaptive visibility weighting in surface determination is also introduced to suppress lines of sight with a large incident angle. Compared to other learning-based approaches, our pipeline only exercises the learning on a 2D binary classification task, i.e., points visible or not in a view, which is much more generalizable, practically more efficient, and capable of dealing with a large number of points. Experiments demonstrate that our method has favorable transferability and robustness, achieves competitive performance with respect to state-of-the-art learning-based approaches on small complex objects, and outperforms them on large indoor and outdoor scenes. Code is available at https://github.com/GDAOSU/vis2mesh.

Submitted 18 August, 2021; originally announced August 2021.

Comments: ICCV2021

arXiv:2108.08017 [pdf, other]

Deep Hybrid Self-Prior for Full 3D Mesh Generation

Authors: Xingkui Wei, Zhengqing Chen, Yanwei Fu, Zhaopeng Cui, Yinda Zhang

Abstract: We present a deep learning pipeline that leverages network self-prior to recover a full 3D model consisting of both a triangular mesh and a texture map from a colored 3D point cloud. Unlike previous methods, which exploit either a 2D self-prior for image editing or a 3D self-prior for pure surface reconstruction, we propose to exploit a novel hybrid 2D-3D self-prior in deep neural networks to significantly improve the geometry quality and produce a high-resolution texture map, which is typically missing from the output of commodity-level 3D scanners. In particular, we first generate an initial mesh using a 3D convolutional neural network with 3D self-prior, and then encode both 3D information and color information in the 2D UV atlas, which is further refined by 2D convolutional neural networks with the self-prior. In this way, both 2D and 3D self-priors are utilized for the mesh and texture recovery. Experiments show that, without the need for any additional training data, our method recovers a high-quality 3D textured mesh model from sparse input, and outperforms state-of-the-art methods in terms of both geometry and texture quality.

Submitted 24 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: Accepted by ICCV2021

arXiv:2108.05690 [pdf, ps, other]

Going Deeper in Frequency Convolutional Neural Network: A Theoretical Perspective

Authors: Xiaohan Zhu, Zhen Cui, Tong Zhang, Yong Li, Jian Yang

Abstract: The convolutional neural network (CNN) is one of the most widely used and successful architectures in the era of deep learning. However, the high computational cost of CNNs still hampers their wider use on lightweight devices. Fortunately, the Fourier transform of convolution gives an elegant and promising solution to dramatically reduce the computation cost. Recently, some studies have been devoted to this challenging problem and pursue complete frequency-domain computation without any switching between the spatial domain and the frequency domain. In this work, we revisit Fourier transform theory to derive feed-forward and back-propagation frequency operations of typical network modules such as convolution, activation and pooling. Due to the limited support for complex-number calculation in most computation tools, we further extend the Fourier transform to the Laplace transform for CNNs, which can run in the real domain with more relaxed constraints. This work focuses more on a theoretical extension and discussion of frequency CNNs, and lays some theoretical groundwork for real applications.

Submitted 12 August, 2021; originally announced August 2021.

arXiv:2108.05187 [pdf, other]

Discriminative Distillation to Reduce Class Confusion in Continual Learning

Authors: Changhong Zhong, Zhiying Cui, Ruixuan Wang, Wei-Shi Zheng

Abstract: Successful continual learning of new knowledge would enable intelligent systems to recognize more and more classes of objects. However, current intelligent systems often fail to correctly recognize previously learned classes of objects when updated to learn new classes. It is widely believed that such downgraded performance is solely due to the catastrophic forgetting of previously learned knowledge. In this study, we argue that the class confusion phenomenon may also play a role in downgrading the classification performance during continual learning, i.e., the high similarity between new classes and any previously learned classes would also cause the classifier to make mistakes in recognizing these old classes, even if the knowledge of these old classes is not forgotten. To alleviate the class confusion issue, we propose a discriminative distillation strategy to help the classifier learn the discriminative features between confusing classes during continual learning. Experiments on multiple natural image classification tasks support that the proposed distillation strategy, when combined with existing methods, is effective in further improving continual learning.

Submitted 11 August, 2021; originally announced August 2021.

Comments: arXiv admin note: text overlap with arXiv:2104.13614

arXiv:2108.04983 [pdf, other]

Learning Fair Face Representation With Progressive Cross Transformer

Authors: Yong Li, Yufei Sun, Zhen Cui, Shiguang Shan, Jian Yang

Abstract: Face recognition (FR) has made extraordinary progress owing to the advancement of deep convolutional neural networks. However, demographic bias among different racial cohorts still challenges practical face recognition systems. The race factor has been proven to be a dilemma for fair FR (FFR), as the subject-related specific attributes induce classification bias whilst carrying some useful cues for FR. To mitigate racial bias and meanwhile preserve robust FR, we abstract face identity-related representation as a signal denoising problem and propose a progressive cross transformer (PCT) method for fair face recognition. Originating from signal decomposition theory, we attempt to decouple face representation into i) identity-related components and ii) noisy/identity-unrelated components induced by race. As an extension of signal subspace decomposition, we formulate face decoupling as a generalized functional expression model to cross-predict face identity and race information. The face expression model is further concretized by designing dual cross-transformers to distill identity-related components and suppress racial noise. In order to refine face representation, we take a progressive face decoupling approach to learn identity/race-specific transformations, so that identity-unrelated components induced by race can be better disentangled. We evaluate the proposed PCT on the public fair face recognition benchmarks (BFW, RFW) and verify that PCT is capable of mitigating bias in face recognition while achieving state-of-the-art FR performance. In addition, visualization results show that the attention maps in PCT clearly reveal the race-related/biased facial regions.

Submitted 10 August, 2021; originally announced August 2021.

arXiv:2108.04948 [pdf, other]

doi 10.1016/j.physletb.2021.136631

Pion charge radius from pion+electron elastic scattering data

Authors: Zhu-Fang Cui, Daniele Binosi, Craig D. Roberts, Sebastian M. Schmidt

Abstract: With the aim of extracting the pion charge radius, we analyse extant precise pion+electron elastic scattering data on $Q^2 \in [0.015,0.144]\,$GeV$^2$ using a method based on interpolation via continued fractions augmented by statistical sampling. The scheme avoids any assumptions on the form of function used for the representation of data and subsequent extrapolation onto $Q^2\simeq 0$. Combining results obtained from the two available data sets, we obtain $r_π= 0.640(7)\,$fm, a value $2.4\,σ$ below today's commonly quoted average. The tension may be relieved by collection and similar analysis of new precise data that densely cover a domain which reaches well below $Q^2 = 0.015\,$GeV$^2$. Considering available kaon+electron elastic scattering data sets, our analysis reveals that they contain insufficient information to extract an objective result for the charged-kaon radius, $r_K$. New data with much improved precision, low-$Q^2$ reach and coverage are necessary before a sound result for $r_K$ can be recorded.

Submitted 10 August, 2021; originally announced August 2021.

Comments: 5 pages, 4 figures

Report number: NJU-INP 047/21

Journal ref: Phys. Lett. B 822 (2021) 136631

arXiv:2108.02306 [pdf, other]

doi 10.1140/epja/s10050-021-00574-w

Dynamical diquarks in the ${\boldsymbol{γ^{(\ast)} p\to N(1535)\tfrac{1}{2}^-}}$ transition

Authors: Khépani Raya, L. X. Gutiérrez-Guerrero, Adnan Bashir, Lei Chang, Zhu-Fang Cui, Ya Lu, Craig D. Roberts, Jorge Segovia

Abstract: The $γ^{(\ast)}+p \to N(1535) \tfrac{1}{2}^-$ transition is studied using a symmetry-preserving regularisation of a vector$\,\otimes\,$vector contact interaction (SCI). The framework employs a Poincaré-covariant Faddeev equation to describe the initial and final state baryons as quark+diquark composites, wherein the diquark correlations are fully dynamical, interacting with the photon as allowed by their quantum numbers and continually engaging in breakup and recombination as required by the Faddeev kernel. The presence of such correlations owes largely to the mechanisms responsible for the emergence of hadron mass; and whereas the nucleon Faddeev amplitude is dominated by scalar and axial-vector diquark correlations, the amplitude of its parity partner, the $N(1535) \tfrac{1}{2}^-$, also contains sizeable pseudoscalar and vector diquark components. It is found that the $γ^{(\ast)}+p \to N(1535) \tfrac{1}{2}^-$ helicity amplitudes and related Dirac and Pauli form factors are keenly sensitive to the relative strengths of these diquark components in the baryon amplitudes, indicating that such resonance electrocouplings possess great sensitivity to baryon structural details. Whilst SCI analyses have their limitations, they also have the virtue of algebraic simplicity and a proven ability to reveal insights that can be used to inform more sophisticated studies in frameworks with closer ties to quantum chromodynamics.

Submitted 4 August, 2021; originally announced August 2021.

Comments: 16 pages, 14 figures, 4 tables

Report number: NJU-INP 046/21

Journal ref: Eur. Phys. J. A (2021) 57:266

arXiv:2108.00792 [pdf, other]

doi 10.3847/1538-4357/ac17eb

On the importance of wave planet interactions for the migration of two super-Earths embedded in a protoplanetary disk

Authors: Zijia Cui, John C. B. Papaloizou, Ewa Szuszkiewicz

Abstract: We investigate a repulsion mechanism between two low-mass planets migrating in a protoplanetary disk, for which the relative migration switches from convergent to divergent. This mechanism invokes density waves emitted by one planet transferring angular momentum to the coorbital region of the other and then directly to it through the horseshoe drag. We formulate simple analytical estimates, which indicate when the repulsion mechanism is effective. One condition for a planet to be repelled is that it forms a partial gap in the disk, and another is that this gap should contain enough material to support angular momentum exchange with it. Using two-dimensional hydrodynamical simulations, we obtain divergent migration of two super-Earths embedded in a protoplanetary disk because of repulsion between them and verify these conditions. To investigate the importance of resonant interaction, we study the migration of planet pairs near first-order commensurabilities. It appears that proximity to resonance is significant but not essential. In this context we find that repulsion still occurs when the gravitational interaction between the planets is removed, suggesting the importance of angular momentum transfer through waves excited by another planet. This may occur through the scattering of coorbital material (the horseshoe drag), or material orbiting close by. Our results indicate that if conditions favor the repulsion between two planets described above, we expect to observe planet pairs with their period ratios greater, often only slightly greater, than resonant values, or possibly a rarity of commensurability.

Submitted 2 August, 2021; originally announced August 2021.

Comments: 35 pages, 31 figures, to be published in ApJ

arXiv:2107.12762 [pdf, other]

Multi-Scale Local-Temporal Similarity Fusion for Continuous Sign Language Recognition

Authors: Pan Xie, Zhi Cui, Yao Du, Mengyi Zhao, Jianwei Cui, Bin Wang, Xiaohui Hu

Abstract: Continuous sign language recognition (cSLR) is a task of public significance that transcribes a sign language video into an ordered gloss sequence. It is important to capture the fine-grained gloss-level details, since there is no explicit alignment between sign video frames and the corresponding glosses. Among past works, one promising approach is to adopt a one-dimensional convolutional network (1D-CNN) to temporally fuse the sequential frames. However, CNNs are agnostic to similarity or dissimilarity, and thus are unable to capture locally consistent semantics within temporally neighboring frames. To address the issue, we propose to adaptively fuse local features via temporal similarity for this task. Specifically, we devise a Multi-scale Local-Temporal Similarity Fusion Network (mLTSF-Net) as follows: 1) For a specific video frame, we first select its similar neighbours with multi-scale receptive regions to accommodate different lengths of glosses. 2) To ensure temporal consistency, we then use position-aware convolution to temporally convolve each scale of selected frames. 3) To obtain a local-temporally enhanced frame-wise representation, we finally fuse the results of different scales using a content-dependent aggregator. We train our model in an end-to-end fashion, and the experimental results on the RWTH-PHOENIX-Weather 2014 dataset (RWTH) demonstrate that our model achieves competitive performance compared with several state-of-the-art models.

Submitted 27 July, 2021; originally announced July 2021.

arXiv:2107.11762 [pdf]

DR2L: Surfacing Corner Cases to Robustify Autonomous Driving via Domain Randomization Reinforcement Learning

Authors: Haoyi Niu, Jianming Hu, Zheyu Cui, Yi Zhang

Abstract: How to explore corner cases as efficiently and thoroughly as possible has long been one of the top concerns in the context of deep reinforcement learning (DeepRL) autonomous driving. Training with simulated data is less costly and dangerous than utilizing real-world data, but the inconsistency of parameter distribution and the incorrect system modeling in simulators always lead to an inevitable Sim2real gap, which probably accounts for the underperformance in novel, anomalous and risky cases that simulators can hardly generate. Domain Randomization (DR) is a methodology that can bridge this gap with little or no real-world data. Consequently, in this research, an adversarial model is put forward to robustify DeepRL-based autonomous vehicles trained in simulation by gradually surfacing harder events, so that the models can readily transfer to the real world.

Submitted 25 July, 2021; originally announced July 2021.

Comments: 8 pages, 7 figures

arXiv:2107.09899 [pdf, other]

Structure-Aware Long Short-Term Memory Network for 3D Cephalometric Landmark Detection

Authors: Runnan Chen, Yuexin Ma, Nenglun Chen, Lingjie Liu, Zhiming Cui, Yanhong Lin, Wenping Wang

Abstract: Detecting 3D landmarks on cone-beam computed tomography (CBCT) is crucial to assessing and quantifying the anatomical abnormalities in 3D cephalometric analysis. However, the current methods are time-consuming and suffer from large biases in landmark localization, leading to unreliable diagnosis results. In this work, we propose a novel Structure-Aware Long Short-Term Memory framework (SA-LSTM) for efficient and accurate 3D landmark detection. To reduce the computational burden, SA-LSTM is designed in two stages. It first locates the coarse landmarks via heatmap regression on a down-sampled CBCT volume and then progressively refines landmarks by attentive offset regression using multi-resolution cropped patches. To boost accuracy, SA-LSTM captures global-local dependence among the cropped patches via self-attention. Specifically, a novel graph attention module implicitly encodes the landmark's global structure to rationalize the predicted position. Moreover, a novel attention-gated module recursively filters irrelevant local features and maintains high-confidence local predictions for aggregating the final result. Experiments conducted on an in-house dataset and a public dataset show that our method outperforms state-of-the-art methods, achieving 1.64 mm and 2.37 mm average errors, respectively. Furthermore, our method is very efficient, taking only 0.5 seconds to run inference on a whole CBCT volume of resolution 768$\times$768$\times$576.

Submitted 18 February, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: IEEE Transactions on medical images

arXiv:2107.06532 [pdf, other]

doi 10.1109/TIP.2022.3177952

Graph Jigsaw Learning for Cartoon Face Recognition

Authors: Yong Li, Lingjie Lao, Zhen Cui, Shiguang Shan, Jian Yang

Abstract: Cartoon face recognition is challenging because cartoon faces typically have smooth color regions and emphasized edges; the key to recognizing cartoon faces is to precisely perceive their sparse and critical shape patterns. However, it is quite difficult to learn a shape-oriented representation for cartoon face recognition with convolutional neural networks (CNNs). To mitigate this issue, we propose GraphJigsaw, which constructs jigsaw puzzles at various stages in the classification network and solves the puzzles with a graph convolutional network (GCN) in a progressive manner. Solving the puzzles requires the model to spot the shape patterns of the cartoon faces, as the texture information is quite limited. The key idea of GraphJigsaw is constructing a jigsaw puzzle by randomly shuffling the intermediate convolutional feature maps in the spatial dimension and exploiting the GCN to reason about and recover the correct layout of the jigsaw fragments in a self-supervised manner. The proposed GraphJigsaw avoids training the classification model with deconstructed images, which would introduce noisy patterns and harm the final classification. Specifically, GraphJigsaw can be incorporated at various stages in a top-down manner within the classification model, which facilitates propagating the learned shape patterns gradually. GraphJigsaw does not rely on any extra manual annotation during the training process and incurs no extra computational burden at inference time. Both quantitative and qualitative experimental results have verified the feasibility of our proposed GraphJigsaw, which consistently outperforms other face recognition or jigsaw-based methods on two popular cartoon face datasets with considerable improvements.

Submitted 14 July, 2021; originally announced July 2021.
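
The jigsaw construction described in the abstract above (shuffling a feature map in the spatial dimension and asking a model to recover the layout) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `make_jigsaw`, the tile-grid parameter `grid`, and the use of a plain 2D list in place of a convolutional feature map are all assumptions made for illustration. The returned permutation would serve as the self-supervised target that a puzzle solver (a GCN in the paper) tries to predict.

```python
import random


def make_jigsaw(feature_map, grid, rng):
    """Shuffle a 2D map tile-wise (hypothetical helper, not from the paper).

    The map is cut into grid x grid equally sized tiles. Returns the shuffled
    map and a permutation `perm`, where perm[i] is the original index of the
    tile now placed at position i (row-major tile order).
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % grid or width % grid:
        raise ValueError("map size must be divisible by the tile grid")
    tile_h, tile_w = height // grid, width // grid

    # Random layout of the tiles; this doubles as the recovery target.
    perm = list(range(grid * grid))
    rng.shuffle(perm)

    shuffled = [[None] * width for _ in range(height)]
    for dst, src in enumerate(perm):
        dst_row, dst_col = divmod(dst, grid)
        src_row, src_col = divmod(src, grid)
        # Copy the source tile into its new position.
        for r in range(tile_h):
            for c in range(tile_w):
                shuffled[dst_row * tile_h + r][dst_col * tile_w + c] = (
                    feature_map[src_row * tile_h + r][src_col * tile_w + c]
                )
    return shuffled, perm
```

Calling `make_jigsaw(fmap, 2, random.Random(0))` on a 4x4 map yields a shuffled copy plus a permutation of `[0, 1, 2, 3]`; no labels beyond the permutation itself are needed, which is what makes the task self-supervised.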

arXiv:2107.03644 [pdf, other]

ComFormer: Code Comment Generation via Transformer and Fusion Method-based Hybrid Code Representation

Authors: Guang Yang, Xiang Chen, Jinxin Cao, Shuyuan Xu, Zhanqi Cui, Chi Yu, Ke Liu

Abstract: Developers often write low-quality code comments due to a lack of programming experience, which can reduce the efficiency of developers' program comprehension. Therefore, developers hope that code comment generation tools can be developed to illustrate the functionality and purpose of the code. Recently, researchers have mainly modeled this problem as a neural machine translation problem and tend to use deep learning-based methods. In this study, we propose a novel method, ComFormer, based on Transformer and fusion method-based hybrid code representation. Moreover, to alleviate the OOV (out-of-vocabulary) problem and speed up model training, we further utilize the Byte-BPE algorithm to split identifiers and the Sim_SBT method to perform AST traversal. We compare ComFormer with seven state-of-the-art baselines from the code comment generation and neural machine translation domains. Comparison results show the competitiveness of ComFormer in terms of three performance measures. Moreover, we perform a human study to verify that ComFormer can generate high-quality comments.

Submitted 8 July, 2021; originally announced July 2021.

Comments: DSA2021

arXiv:2107.03488 [pdf, other]

doi 10.1140/epjc/s10052-021-09673-w

Vector-meson production and vector meson dominance

Authors: Yin-Zhen Xu, Si-Yang Chen, Zhao-Qian Yao, Daniele Binosi, Zhu-Fang Cui, Craig D. Roberts

Abstract: We consider the fidelity of the vector meson dominance (VMD) assumption as an instrument for relating the electromagnetic vector-meson production reaction $e + p \to e^\prime + V + p$ to the purely hadronic process $V + p \to V+p$. Analyses of the photon vacuum polarisation and the photon-quark vertex reveal that such a VMD Ansatz might be reasonable for light vector-mesons. However, when the vector-mesons are described by momentum-dependent bound-state amplitudes, VMD fails for heavy vector-mesons: it cannot be used reliably to estimate either a photon-to-vector-meson transition strength or the momentum dependence of those integrands that would arise in calculations of the different reaction amplitudes. Consequently, for processes involving heavy mesons, the veracity of both cross-section estimates and conclusions based on the VMD assumption should be reviewed, e.g., those relating to hidden-charm pentaquark production and the origin of the proton mass.

Submitted 7 July, 2021; originally announced July 2021.

Comments: 12 pages, 2 figures, 2 tables

Report number: NJU-INP 044/21

arXiv:2106.07969 [pdf, ps, other]

doi 10.1088/1674-4527/21/10/246

The low cosmic-ray density in Polaris Flare

Authors: Zhi-wei Cui, Rui-zhi Yang, Bing Liu

Abstract: We report the gamma-ray observation towards the giant molecular cloud Polaris Flare. Together with the dust column density map, we derive the cosmic ray (CR) density and spectrum in this cloud. Compared with the CR measured locally, the CR density in Polaris Flare is significantly lower and the spectrum is softer. Such a different CR spectrum reveals either a rather large gradient of CR distribution in the direction perpendicular to the Galactic plane or a suppression of CR inside molecular clouds.

Submitted 15 June, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: 7 pages, 5 figures, accepted for publication in Research in Astronomy and Astrophysics (RAA)

arXiv:2106.06188 [pdf, ps, other]

Precise large deviations of sums of widely dependent random variables and its applications

Authors: Zhaolei Cui, Yuebao Wang

Abstract: In this paper, we obtain some results on precise large deviations for non-random and random sums of widely dependent random variables with common dominatedly varying tail distribution or consistently varying tail distribution on $(-\infty,\infty)$. Then we apply the results to reinsurance and insurance and give some asymptotic estimates on proportional reinsurance, random-time ruin probability and the finite-time ruin probability.

Submitted 11 June, 2021; originally announced June 2021.

arXiv:2106.05519 [pdf, other]

Consistent Instance False Positive Improves Fairness in Face Recognition

Authors: Xingkun Xu, Yuge Huang, Pengcheng Shen, Shaoxin Li, Jilin Li, Feiyue Huang, Yong Li, Zhen Cui

Abstract: Demographic bias is a significant challenge in practical face recognition systems. Existing methods heavily rely on accurate demographic annotations. However, such annotations are usually unavailable in real scenarios. Moreover, these methods are typically designed for a specific demographic group and are not general enough. In this paper, we propose a false positive rate penalty loss, which mitigates face recognition bias by increasing the consistency of instance False Positive Rate (FPR). Specifically, we first define the instance FPR as the ratio between the number of the non-target similarities above a unified threshold and the total number of the non-target similarities. The unified threshold is estimated for a given total FPR. Then, an additional penalty term, which is in proportion to the ratio of the instance FPR to the overall FPR, is introduced into the denominator of the softmax-based loss. The larger the instance FPR, the larger the penalty. Through such unequal penalties, the instance FPRs are expected to become consistent. Compared with previous debiasing methods, our method requires no demographic annotations. Thus, it can mitigate the bias among demographic groups divided by various attributes, and these attributes need not be predefined during training. Extensive experimental results on popular benchmarks demonstrate the superiority of our method over state-of-the-art competitors. Code and trained models are available at https://github.com/Tencent/TFace.

Submitted 10 June, 2021; originally announced June 2021.

Comments: CVPR2021
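
The instance FPR defined in the abstract above can be illustrated with a minimal sketch. It is not the TFace implementation: the function names, the toy similarity values, and the quantile-style threshold estimate are assumptions made here for illustration, and the softmax-based loss itself is not reproduced.

```python
def unified_threshold(pooled_non_target_sims, total_fpr):
    """Estimate one threshold for a given total FPR (assumed quantile rule).

    The threshold is the (k+1)-th largest pooled similarity, so that k values
    lie strictly above it when there are no ties.
    """
    ranked = sorted(pooled_non_target_sims, reverse=True)
    k = min(int(total_fpr * len(ranked)), len(ranked) - 1)
    return ranked[k]


def instance_fpr(non_target_sims, threshold):
    """Ratio of non-target similarities above the threshold to their total number."""
    if not non_target_sims:
        raise ValueError("need at least one non-target similarity")
    above = sum(1 for s in non_target_sims if s > threshold)
    return above / len(non_target_sims)


# Hypothetical non-target similarities for two training instances.
sims = {
    "a": [0.9, 0.8, 0.1, 0.05],
    "b": [0.3, 0.25, 0.1, 0.0],
}
total_fpr = 0.25
pooled = [s for values in sims.values() for s in values]
threshold = unified_threshold(pooled, total_fpr)

# Ratio of instance FPR to overall FPR; the abstract's penalty is proportional to it.
ratios = {name: instance_fpr(v, threshold) / total_fpr for name, v in sims.items()}
```

In this toy data, instance "a" has more non-target similarities above the shared threshold than instance "b", so it would receive the larger penalty.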

arXiv:2105.13717 [pdf, other]

doi 10.1109/GLOBECOM46510.2021.9685078

Coverage Analysis of Cellular-Connected UAV Communications with 3GPP Antenna and Channel Models

Authors: Zhuangzhuang Cui, Ke Guan, İsmail Güvenç, Claude Oestges, Zhangdui Zhong

Abstract: For reliable and efficient communications of aerial platforms, such as unmanned aerial vehicles (UAVs), the cellular network is envisioned to provide connectivity for the aerial and ground user equipment (GUE) simultaneously, which brings challenges to the existing pattern of the base station (BS) tailored for ground-level services. Thus, we focus on the coverage probability analysis to investigate the coexistence of aerial and terrestrial users, by employing realistic antenna and channel models reported in the 3rd Generation Partnership Project (3GPP). The homogeneous Poisson point process (PPP) is used to describe the BS distribution, and the BS antenna is adjustable in the down-tilted angle and the number of antennas in the array. Meanwhile, omnidirectional antennas are used for cellular users. We first derive the approximation of coverage probability and then conduct numerous simulations to evaluate the impacts of antenna numbers, down-tilted angles, carrier frequencies, and user heights. One of the essential findings indicates that the coverage probabilities of high-altitude users become less sensitive to the down-tilted angle. Moreover, we find that the aerial user equipment (AUE) in a certain range of heights can achieve the same or better coverage probability than that of GUE, which provides an insight into the effective deployment of cellular-connected aerial communications.

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2105.13593 [pdf, other]

Semi-supervised Anatomical Landmark Detection via Shape-regulated Self-training

Authors: Runnan Chen, Yuexin Ma, Lingjie Liu, Nenglun Chen, Zhiming Cui, Guodong Wei, Wenping Wang

Abstract: Well-annotated medical images are costly and sometimes even impossible to acquire, hindering landmark detection accuracy to some extent. Semi-supervised learning alleviates the reliance on large-scale annotated data by exploiting the unlabeled data to understand the population structure of anatomical landmarks. The global shape constraint is the inherent property of anatomical landmarks that provides valuable guidance for more consistent pseudo labelling of the unlabeled data, which is ignored in previous semi-supervised methods. In this paper, we propose a model-agnostic shape-regulated self-training framework for semi-supervised landmark detection by fully considering the global shape constraint. Specifically, to ensure pseudo labels are reliable and consistent, a PCA-based shape model adjusts pseudo labels and eliminates abnormal ones. A novel Region Attention loss makes the network automatically focus on the structure-consistent regions around pseudo labels. Extensive experiments show that our approach outperforms other semi-supervised methods and achieves notable improvement on three medical image datasets. Moreover, our framework is flexible and can be used as a plug-and-play module integrated into most supervised methods to improve performance further.

Submitted 27 November, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

Comments: Accepted to Neurocomputing

arXiv:2105.11866 [pdf, other]

GraphFM: Graph Factorization Machines for Feature Interaction Modeling

Authors: Shu Wu, Zekun Li, Yunyue Su, Zeyu Cui, Xiaoyu Zhang, Liang Wang

Abstract: Factorization machine (FM) is a prevalent approach to modeling pairwise (second-order) feature interactions when dealing with high-dimensional sparse data. However, on the one hand, FM fails to capture higher-order feature interactions because of combinatorial expansion. On the other hand, taking into account interactions between every pair of features may introduce noise and degrade prediction accuracy. To solve these problems, we propose a novel approach, Graph Factorization Machine (GraphFM), by naturally representing features in the graph structure. In particular, we design a mechanism to select the beneficial feature interactions and formulate them as edges between features. Then the proposed model, which integrates the interaction function of FM into the feature aggregation strategy of Graph Neural Network (GNN), can model arbitrary-order feature interactions on the graph-structured features by stacking layers. Experimental results on several real-world datasets have demonstrated the rationality and effectiveness of our proposed approach. The code and data are available at https://github.com/CRIPAC-DIG/GraphCTR.

Submitted 31 March, 2024; v1 submitted 25 May, 2021; originally announced May 2021.

Comments: The code and data are available at https://github.com/CRIPAC-DIG/GraphCTR
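
For readers unfamiliar with the pairwise (second-order) interaction term of a plain factorization machine mentioned in the abstract above, here is a minimal sketch. It shows the standard FM term only, not GraphFM or the GraphCTR code; the function names and toy values are hypothetical.

```python
def fm_pairwise_naive(x, v):
    """Sum over all feature pairs i < j of <v_i, v_j> * x_i * x_j."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same quantity via the linear-time identity
    0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]."""
    total = 0.0
    for f in range(len(v[0])):
        terms = [v[i][f] * x[i] for i in range(len(x))]
        total += sum(terms) ** 2 - sum(t * t for t in terms)
    return 0.5 * total


# Toy input: three features, two latent factors per feature.
x = [1.0, 0.0, 2.0]
v = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
naive = fm_pairwise_naive(x, v)
fast = fm_pairwise_fast(x, v)
```

Both functions score every pair of features, which is the behaviour the abstract argues can introduce noise; selecting only beneficial interactions is the part GraphFM adds and is not shown here.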

arXiv:2105.01489 [pdf, other]

doi 10.1103/PhysRevLett.127.147202

Efficient Method for Prediction of Meta-stable/Ground Multipolar Ordered States and its Application in Monolayer $α$-\ce{RuX3} (X=Cl,I)

Authors: Wen-Xuan Qiu, Jin-Yu Zou, Ai-Yun Luo, Zhi-Hai Cui, Zhi-Da Song, Jin-Hua Gao, Yi-Lin Wang, Gang Xu

Abstract: Exotic high-rank multipolar order parameters have been found to be unexpectedly active in more and more correlated materials in recent years. Such multipoles are usually dubbed "Hidden Orders" since they are insensitive to common experimental probes. Theoretically, it is also difficult to predict multipolar orders via \textit{ab initio} calculations in real materials. Here, we present an efficient method to predict possible multipoles in materials based on linear response theory under random phase approximation. Using this method, we successfully predict two pure meta-stable magnetic octupolar states in monolayer $α$-\ce{RuCl3}, which is confirmed by self-consistent unrestricted Hartree-Fock calculations. We then demonstrate that these octupolar states can be stabilized in monolayer $α$-\ce{RuI3}, one of which becomes the octupolar ground state. Furthermore, we also predict a fingerprint of orthogonal magnetization pattern produced by the octupole moment, which can be easily detected by experiment. The method and the example presented in this work serve as guidance for searching multipolar order parameters in other correlated materials.

Submitted 6 September, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

Comments: 15 pages,8 figures

Journal ref: Phys. Rev. Lett. 127, 147202 (2021)

arXiv:2104.12483 [pdf, other]

Represent Items by Items: An Enhanced Representation of the Target Item for Recommendation

Authors: Yinjiang Cai, Zeyu Cui, Shu Wu, Zhen Lei, Xibo Ma

Abstract: Item-based collaborative filtering (ICF) has been widely used in industrial applications such as recommender systems and online advertising. It models users' preference on target items by the items they have interacted with. Recent models use methods such as attention mechanism and deep neural network to learn the user representation and scoring function more accurately. However, despite their effectiveness, such models still overlook the problem that the performance of ICF methods heavily depends on the quality of item representation, especially the target item representation. In fact, due to the long-tail distribution in the recommendation, most item embeddings cannot represent the semantics of items accurately and thus degrade the performance of current ICF methods. In this paper, we propose an enhanced representation of the target item which distills relevant information from the co-occurrence items. We design sampling strategies to sample a fixed number of co-occurrence items for the sake of noise reduction and computational cost. Considering the different importance of sampled items to the target item, we apply attention mechanism to selectively adopt the semantic information of the sampled items. Our proposed Co-occurrence based Enhanced Representation model (CER) learns the scoring function by a deep neural network with the attentive user representation and fusion of raw representation and enhanced representation of target item as input. With the enhanced representation, CER has stronger representation power for the tail items compared to the state-of-the-art ICF methods. Extensive experiments on two public benchmarks demonstrate the effectiveness of CER.

Submitted 26 April, 2021; originally announced April 2021.

Comments: 10 pages,7 figures

arXiv:2104.10261 [pdf, other]

doi 10.1016/j.physletb.2021.136344

Semileptonic $B_c \to η_c, J/ψ$ transitions

Authors: Zhao-Qian Yao, Daniele Binosi, Zhu-Fang Cui, Craig D. Roberts

Abstract: Using a systematic, symmetry-preserving continuum approach to the Standard Model strong-interaction bound-state problem, we deliver parameter-free predictions for all semileptonic $B_c \to η_c, J/ψ$ transition form factors on the complete domains of empirically accessible momentum transfers. Working with branching fractions calculated therefrom, the following values of the ratios for $τ$ over $μ$ final states are obtained: $R_{η_c}=0.313(22)$ and $R_{J/ψ}=0.242(47)$. Combined with other recent results, our analysis confirms a $2σ$ discrepancy between the Standard Model prediction for $R_{J/ψ}$ and the single available experimental result.

Submitted 21 April, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: 8 pages, 4 figures, 5 tables

Report number: NJU-INP 041/21

arXiv:2104.06813 [pdf, other]

doi 10.1145/3394171.3416277

Global Information Guided Video Anomaly Detection

Authors: Hui Lv, Chunyan Xu, Zhen Cui

Abstract: Video anomaly detection (VAD) is currently a challenging task due to the complexity of anomalies as well as the lack of labor-intensive temporal annotations. In this paper, we propose an end-to-end Global Information Guided (GIG) anomaly detection framework that uses video-level annotations (i.e., weak labels). We propose to first mine the global pattern cues by leveraging the weak labels in a GIG module. Then we build a spatial reasoning module to measure the relevance between vectors in the spatial domain and the global cue vectors, and select the most related feature vectors for temporal anomaly detection. The experimental results on the CityScene challenge demonstrate the effectiveness of our model.

Submitted 14 April, 2021; originally announced April 2021.

arXiv:2104.06689 [pdf, other]

Learning Normal Dynamics in Videos with Meta Prototype Network

Authors: Hui Lv, Chen Chen, Zhen Cui, Chunyan Xu, Yong Li, Jian Yang

Abstract: Frame reconstruction (current or future frame) based on Auto-Encoder (AE) is a popular method for video anomaly detection. With models trained on the normal data, the reconstruction errors of anomalous scenes are usually much larger than those of normal ones. Previous methods introduced the memory bank into AE, for encoding diverse normal patterns across the training videos. However, they are memory-consuming and cannot cope with unseen new scenarios in the testing data. In this work, we propose a dynamic prototype unit (DPU) to encode the normal dynamics as prototypes in real time, free from extra memory cost. In addition, we introduce meta-learning to our DPU to form a novel few-shot normalcy learner, namely Meta-Prototype Unit (MPU). It enables fast adaptation to new scenes by consuming only a few update iterations. Extensive experiments are conducted on various benchmarks. The superior performance over the state-of-the-art demonstrates the effectiveness of our method.

Submitted 10 May, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: 9 pages, 4 figures, 6 tables

arXiv:2104.05901 [pdf, other]

SRR-Net: A Super-Resolution-Involved Reconstruction Method for High Resolution MR Imaging

Authors: Wenqi Huang, Sen Jia, Ziwen Ke, Zhuo-Xu Cui, Jing Cheng, Yanjie Zhu, Dong Liang

Abstract: Improving the image resolution and acquisition speed of magnetic resonance imaging (MRI) is a challenging problem. There are mainly two strategies dealing with the speed-resolution trade-off: (1) $k$-space undersampling with high-resolution acquisition, and (2) a pipeline of lower resolution image reconstruction and image super-resolution. However, these approaches either have limited performance at certain high acceleration factors or suffer from the error accumulation of the two-step structure. In this paper, we combine the idea of MR reconstruction and image super-resolution (SR), and work on recovering high-resolution (HR) images from low-resolution under-sampled $k$-space data directly. Particularly, the SR-involved reconstruction can be formulated as a variational problem, and a learnable network unrolled from its solution algorithm is proposed. A discriminator is introduced to enhance the detail refining performance. Experimental results using in-vivo HR multi-coil brain data indicate that the proposed SRR-Net is capable of recovering high-resolution brain images with both good visual quality and perceptual quality.

Submitted 12 April, 2021; originally announced April 2021.

arXiv:2104.05706 [pdf, other]

Towards Efficient Graph Convolutional Networks for Point Cloud Handling

Authors: Yawei Li, He Chen, Zhaopeng Cui, Radu Timofte, Marc Pollefeys, Gregory Chirikjian, Luc Van Gool

Abstract: In this paper, we aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds. The basic graph convolution that is typically composed of a $K$-nearest neighbor (KNN) search and a multilayer perceptron (MLP) is examined. By mathematically analyzing the operations there, two findings to improve the efficiency of GCNs are obtained. (1) The local geometric structure information of 3D representations propagates smoothly across the GCN that relies on KNN search to gather neighborhood features. This motivates the simplification of multiple KNN searches in GCNs. (2) Shuffling the order of graph feature gathering and an MLP leads to equivalent or similar composite operations. Based on those findings, we optimize the computational procedure in GCNs. A series of experiments show that the optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed while maintaining comparable accuracy for learning on point clouds. Code will be available at https://github.com/ofsoundof/EfficientGCN.git.

Submitted 12 April, 2021; originally announced April 2021.
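
Finding (2) in the abstract above can be illustrated in its simplest form: a map shared across points gives the same result whether it is applied before or after neighbor gathering. The sketch below is not taken from the EfficientGCN code. The helper names, the toy features, and the fixed neighbor indices are assumptions, and a single linear layer stands in for the MLP.

```python
def linear(point, weight):
    """Apply a shared linear layer (list of weight rows) to one point feature."""
    return [sum(w * p for w, p in zip(row, point)) for row in weight]


def gather(features, neighbor_idx):
    """For each point, collect the features of its listed neighbors."""
    return [[features[j] for j in nbrs] for nbrs in neighbor_idx]


# Toy data: 3 points with 2-D features, 2 neighbors each (indices assumed given by KNN).
features = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
neighbor_idx = [[1, 2], [0, 2], [0, 1]]
weight = [[1.0, 0.0], [2.0, -1.0], [0.5, 0.5]]

# Order A: gather first, then map every gathered copy (N * K layer evaluations).
gather_then_map = [[linear(p, weight) for p in group]
                   for group in gather(features, neighbor_idx)]

# Order B: map each point once, then gather (N layer evaluations).
map_then_gather = gather([linear(p, weight) for p in features], neighbor_idx)
```

Order B evaluates the layer once per point instead of once per gathered neighbor, which is where the saving comes from in this toy case. When the gathered features also include neighbor differences, the two orders need not match exactly, which is consistent with the abstract's wording "equivalent or similar".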

arXiv:2104.03493 [pdf, other]

Riggable 3D Face Reconstruction via In-Network Optimization

Authors: Ziqian Bai, Zhaopeng Cui, Xiaoming Liu, Ping Tan

Abstract: This paper presents a method for riggable 3D face reconstruction from monocular images, which jointly estimates a personalized face rig and per-image parameters including expressions, poses, and illuminations. To achieve this goal, we design an end-to-end trainable network embedded with a differentiable in-network optimization. The network first parameterizes the face rig as a compact latent code with a neural decoder, and then estimates the latent code as well as per-image parameters via a learnable optimization. By estimating a personalized face rig, our method goes beyond static reconstructions and enables downstream applications such as video retargeting. In-network optimization explicitly enforces constraints derived from first principles, and thus introduces additional priors compared with regression-based methods. Finally, data-driven priors from deep learning are utilized to constrain the ill-posed monocular setting and ease the optimization difficulty. Experiments demonstrate that our method achieves SOTA reconstruction accuracy, reasonable robustness and generalization ability, and supports standard face rig applications.

Submitted 7 April, 2021; originally announced April 2021.

Comments: CVPR2021. Code: https://github.com/zqbai-jeremy/INORig Camera Ready Paper: https://zqbai-jeremy.github.io/files/INORig.pdf Camera Ready Supp: https://zqbai-jeremy.github.io/files/INORig_supp.pdf

arXiv:2104.02962 [pdf, other]

DyGCN: Dynamic Graph Embedding with Graph Convolutional Network

Authors: Zeyu Cui, Zekun Li, Shu Wu, Xiaoyu Zhang, Qiang Liu, Liang Wang, Mengmeng Ai

Abstract: Graph embedding, aiming to learn low-dimensional representations (a.k.a. embeddings) of nodes, has received significant attention recently. Recent years have witnessed a surge of efforts made on static graphs, among which Graph Convolutional Network (GCN) has emerged as an effective class of models. However, these methods mainly focus on the static graph embedding. In this work, we propose an efficient dynamic graph embedding approach, Dynamic Graph Convolutional Network (DyGCN), which is an extension of GCN-based methods. We naturally generalize the embedding propagation scheme of GCN to the dynamic setting in an efficient manner, which is to propagate the change along the graph to update node embeddings. The most affected nodes are updated first, and then their changes are propagated to further nodes, leading to their update. Extensive experiments conducted on various dynamic graphs demonstrate that our model can update the node embeddings in a time-saving and performance-preserving way.

Submitted 7 April, 2021; originally announced April 2021.

Comments: 21 pages, 5 figures, submitted to TOIS

arXiv:2104.01102 [pdf, other]

Deep Manifold Learning for Dynamic MR Imaging

Authors: Ziwen Ke, Zhuo-Xu Cui, Wenqi Huang, Jing Cheng, Sen Jia, Haifeng Wang, Xin Liu, Hairong Zheng, Leslie Ying, Yanjie Zhu, Dong Liang

Abstract: Purpose: To develop a deep learning method on a nonlinear manifold to explore the temporal redundancy of dynamic signals to reconstruct cardiac MRI data from highly undersampled measurements. Methods: Cardiac MR image reconstruction is modeled as general compressed sensing (CS) based optimization on a low-rank tensor manifold. The nonlinear manifold is designed to characterize the temporal correlation of dynamic signals. Iterative procedures can be obtained by solving the optimization model on the manifold, including gradient calculation, projection of the gradient to tangent space, and retraction of the tangent space to the manifold. The iterative procedures on the manifold are unrolled to a neural network, dubbed Manifold-Net. The Manifold-Net is trained using in vivo data with a retrospective electrocardiogram (ECG)-gated segmented bSSFP sequence. Results: Experimental results at high accelerations demonstrate that the proposed method can obtain improved reconstruction compared with a compressed sensing (CS) method k-t SLR and two state-of-the-art deep learning-based methods, DC-CNN and CRNN. Conclusion: This work represents the first study unrolling the optimization on manifolds into neural networks. Specifically, the designed low-rank manifold provides a new technical route for applying low-rank priors in dynamic MR imaging.

Submitted 8 March, 2021; originally announced April 2021.

Comments: 17 pages, 7 figures

arXiv:2103.15964 [pdf, other]

doi 10.1140/epjc/s10052-021-09898-9

Heavy+light pseudoscalar meson semileptonic transitions

Authors: Zhen-Ni Xu, Zhu-Fang Cui, Craig D. Roberts, Chang Xu

Abstract: A symmetry-preserving regularisation of a vector$\times$vector contact interaction (SCI) is used to deliver a unified treatment of semileptonic transitions involving $π$, $K$, $D_{(s)}$, $B_{(s,c)}$ initial states. The framework is characterised by algebraic simplicity, few parameters, and the ability to simultaneously treat systems from Nambu-Goldstone modes to heavy+heavy mesons. Although the SCI form factors are typically somewhat stiff, the results are comparable with experiment and rigorous theory results. Hence, predictions for the five unmeasured $B_{s,c}$ branching fractions should be a reasonable guide. The analysis provides insights into the effects of Higgs boson couplings via current-quark masses on the transition form factors; and results on $B_{(s)}\to D_{(s)}$ transitions yield a prediction for the Isgur-Wise function in fair agreement with contemporary data.

Submitted 29 March, 2021; originally announced March 2021.

Comments: 17 pages, 13 figures, 3 tables

Report number: NJU-INP 040/21

arXiv:2103.06422 [pdf, other]

Holistic 3D Scene Understanding from a Single Image with Implicit Representation

Authors: Cheng Zhang, Zhaopeng Cui, Yinda Zhang, Bing Zeng, Marc Pollefeys, Shuaicheng Liu

Abstract: We present a new pipeline for holistic 3D scene understanding from a single image, which could predict object shapes, object poses, and scene layout. As it is a highly ill-posed problem, existing methods usually suffer from inaccurate estimation of both shapes and layout especially for the cluttered scene due to the heavy occlusion between objects. We propose to utilize the latest deep implicit representation to solve this challenge. We not only propose an image-based local structured implicit network to improve the object shape estimation, but also refine the 3D object pose and scene layout via a novel implicit scene graph neural network that exploits the implicit local object features. A novel physical violation loss is also proposed to avoid incorrect context between objects. Extensive experiments demonstrate that our method outperforms the state-of-the-art methods in terms of object shape, scene layout estimation, and 3D object detection.

Submitted 22 August, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

Comments: Published in CVPR 2021

arXiv:2103.06126 [pdf, other]

Spatial-Temporal Tensor Graph Convolutional Network for Traffic Prediction

Authors: Xuran Xu, Tong Zhang, Chunyan Xu, Zhen Cui, Jian Yang

Abstract: Accurate traffic prediction is crucial to the guidance and management of urban traffic. However, most of the existing traffic prediction models do not consider the computational burden and memory space when they capture spatial-temporal dependence among traffic data. In this work, we propose a factorized Spatial-Temporal Tensor Graph Convolutional Network to deal with traffic speed prediction. Traffic networks are modeled and unified into a graph that integrates spatial and temporal information simultaneously. We further extend graph convolution into tensor space and propose a tensor graph convolution network to extract more discriminating features from spatial-temporal graph data. To reduce the computational burden, we take Tucker tensor decomposition and derive a factorized tensor convolution, which performs separate filtering in small-scale space, time, and feature modes. Besides, we can benefit from noise suppression of traffic data when discarding those trivial components in the process of tensor decomposition. Extensive experiments on two real-world traffic speed datasets demonstrate that our method is more effective than traditional traffic prediction methods while achieving state-of-the-art performance.

Submitted 10 March, 2021; originally announced March 2021.

arXiv:2103.04243 [pdf, other]

Estimating and Improving Fairness with Adversarial Learning

Authors: Xiaoxiao Li, Ziteng Cui, Yifan Wu, Lin Gu, Tatsuya Harada

Abstract: Fairness and accountability are two essential pillars for trustworthy Artificial Intelligence (AI) in healthcare. However, the existing AI model may be biased in its decision making. To tackle this issue, we propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system. Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model. We further impose an orthogonality regularization to force the two modules to be independent during training. Hence, we can keep these deep learning tasks distinct from one another, and avoid collapsing them into a singular point on the manifold. Through this adversarial training method, the data from the underprivileged group, which is vulnerable to bias because of attributes such as sex and skin tone, are transferred into a domain that is neutral relative to these attributes. Furthermore, the critical module can predict fairness scores for the data with unknown sensitive attributes. We evaluate our framework on a large-scale publicly available skin lesion dataset under various fairness evaluation metrics. The experiments demonstrate the effectiveness of our proposed method for estimating and improving fairness in the deep learning-based medical image analysis system.

Submitted 11 May, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

Comments: 12 pages, 2 figures, 3 tables

arXiv:2103.01055 [pdf, other]

P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching

Authors: Bing Wang, Changhao Chen, Zhaopeng Cui, Jie Qin, Chris Xiaoxuan Lu, Zhengdi Yu, Peijun Zhao, Zhen Dong, Fan Zhu, Niki Trigoni, Andrew Markham

Abstract: Accurately describing and detecting 2D and 3D keypoints is crucial to establishing correspondences across images and point clouds. Despite a plethora of learning-based 2D or 3D local feature descriptors and detectors having been proposed, the derivation of a shared descriptor and joint keypoint detector that directly matches pixels and points remains under-explored by the community. This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds. In order to directly match pixels and points, a dual fully convolutional framework is presented that maps 2D and 3D inputs into a shared latent representation space to simultaneously describe and detect keypoints. Furthermore, an ultra-wide reception mechanism in combination with a novel loss function is designed to mitigate the intrinsic information variations between pixel and point local regions. Extensive experimental results demonstrate that our framework shows competitive performance in fine-grained matching between images and point clouds and achieves state-of-the-art results for the task of indoor visual localization. Our source code will be available at [no-name-for-blind-review].

Submitted 29 July, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: ICCV 2021

arXiv:2102.12568 [pdf, other]

doi 10.1140/epjc/s10052-021-09097-6

Masses of positive- and negative-parity hadron ground-states, including those with heavy quarks

Authors: Pei-Lin Yin, Zhu-Fang Cui, Craig D. Roberts, Jorge Segovia

Abstract: A symmetry-preserving treatment of a vector$\times$vector contact interaction is used to compute spectra of ground-state $J^P = 0^\pm, 1^\pm$ $(f\bar g)$ mesons, their partner diquark correlations, and $J^P=1/2^\pm, 3/2^\pm$ $(fgh)$ baryons, where $f,g,h \in \{u,d,s,c,b\}$. Results for the leptonic decay constants of all mesons are also obtained, including scalar and pseudovector states involving heavy quarks. The spectrum of baryons produced by this chiefly algebraic approach reproduces the 64 masses known empirically or computed using lattice-regularised quantum chromodynamics with an accuracy of 1.4(1.2)%. It also has the richness of states typical of constituent-quark models and predicts many baryon states that have not yet been observed. The study indicates that dynamical, nonpointlike diquark correlations play an important role in all baryons; and, typically, the lightest allowed diquark is the most important component of a baryon's Faddeev amplitude.

Submitted 24 February, 2021; originally announced February 2021.

Comments: 21 pages, 7 tables, 5 figures. arXiv admin note: text overlap with arXiv:1903.00160

Report number: NJU-INP 037/21

arXiv:2102.11127 [pdf, other]

doi 10.1145/3442381.3450115

Graph-based Hierarchical Relevance Matching Signals for Ad-hoc Retrieval

Authors: Xueli Yu, Weizhi Xu, Zeyu Cui, Shu Wu, Liang Wang

Abstract: The ad-hoc retrieval task is to rank related documents given a query and a document collection. A series of deep learning based approaches have been proposed to solve this problem and have gained much attention. However, we argue that they are inherently based on local word sequences, ignoring the subtle long-distance document-level word relationships. To solve the problem, we explicitly model the document-level word relationship through the graph structure, capturing the subtle information via graph neural networks. In addition, due to the complexity and scale of the document collections, it is worthwhile to explore the different grain-sized hierarchical matching signals at a more general level. Therefore, we propose a Graph-based Hierarchical Relevance Matching model (GHRM) for ad-hoc retrieval, by which we can capture the subtle and general hierarchical matching signals simultaneously. We validate the effects of GHRM over two representative ad-hoc retrieval benchmarks; the comprehensive experiments and results demonstrate its superiority over state-of-the-art methods.

Submitted 22 February, 2021; originally announced February 2021.

Comments: To appear at WWW 2021

arXiv:2102.09222 [pdf, other]

doi 10.1007/s11467-021-1062-0

Electron-Ion Collider in China

Authors: Daniele P. Anderle, Valerio Bertone, Xu Cao, Lei Chang, Ningbo Chang, Gu Chen, Xurong Chen, Zhuojun Chen, Zhufang Cui, Lingyun Dai, Weitian Deng, Minghui Ding, Xu Feng, Chang Gong, Longcheng Gui, Feng-Kun Guo, Chengdong Han, Jun He, Tie-Jiun Hou, Hongxia Huang, Yin Huang, Krešimir Kumerički, L. P. Kaptari, Demin Li, Hengne Li, et al. (77 additional authors not shown)

Abstract: Lepton scattering is an established ideal tool for studying the inner structure of small particles such as nucleons as well as nuclei. As a future high energy nuclear physics project, an Electron-ion collider in China (EicC) has been proposed. It will be constructed based on an upgraded heavy-ion accelerator, High Intensity heavy-ion Accelerator Facility (HIAF) which is currently under construction, together with a new electron ring. The proposed collider will provide highly polarized electrons (with a polarization of $\sim$80%) and protons (with a polarization of $\sim$70%) with variable center of mass energies from 15 to 20 GeV and the luminosity of (2-3) $\times$ 10$^{33}$ cm$^{-2}$ s$^{-1}$. Polarized deuterons and Helium-3, as well as unpolarized ion beams from Carbon to Uranium, will also be available at the EicC. The main foci of the EicC will be precision measurements of the structure of the nucleon in the sea quark region, including 3D tomography of the nucleon; the partonic structure of nuclei and the parton interaction with the nuclear environment; the exotic states, especially those with heavy flavor quark contents. In addition, issues fundamental to understanding the origin of mass could be addressed by measurements of heavy quarkonia near-threshold production at the EicC. In order to achieve the above-mentioned physics goals, a hermetic detector system will be constructed with cutting-edge technologies. This document is the result of collective contributions and valuable inputs from experts across the globe.
The EicC physics program complements the ongoing scientific programs at the Jefferson Laboratory and the future EIC project in the United States. The success of this project will also advance both nuclear and particle physics as well as accelerator and detector technology in China.

Submitted 18 February, 2021; originally announced February 2021.

Comments: EicC white paper, written by the whole EicC working group

Report number: Frontiers of Physics, Volume 16 Issue (6):64701, 2021

Journal ref: Frontiers of Physics, Volume 16 Issue (6):64701, 2021

arXiv:2102.05206 [pdf]

doi 10.1103/PhysRevResearch.4.023170

Microwave Sensing of Andreev Bound States in a Gate-Defined Superconducting Quantum Point Contact

Authors: Vivek Chidambaram, Anders Kringhøj, Lucas Casparis, Ferdinand Kuemmeth, Tiantian Wang, Candice Thomas, Sergei Gronin, Geoffrey C. Gardner, Zhengyi Cui, Chenlu Liu, Kristof Moors, Michael J. Manfra, Karl D. Petersson, Malcolm R. Connolly

Abstract: We use a superconducting microresonator as a cavity to sense absorption of microwaves by a superconducting quantum point contact defined by surface gates over a proximitized two-dimensional electron gas. Renormalization of the cavity frequency with phase difference across the point contact is consistent with adiabatic coupling to Andreev bound states. Near $π$ phase difference, we observe random fluctuations in absorption with gate voltage, related to quantum interference-induced modulations in the electron transmission. We identify features consistent with the presence of single Andreev bound states and describe the Andreev-cavity interaction using a dispersive Jaynes-Cummings model. By fitting the weak Andreev-cavity coupling, we extract ~GHz decoherence consistent with charge noise and the transmission dispersion associated with a localized state.

Submitted 1 September, 2022; v1 submitted 9 February, 2021; originally announced February 2021.

Report number: NBI QDEV 2021

Journal ref: Phys. Rev. Research 4, 023170 (2022)

arXiv:2102.01180 [pdf, other]

doi 10.1103/PhysRevLett.127.092001

Fresh extraction of the proton charge radius from electron scattering

Authors: Zhu-Fang Cui, Daniele Binosi, Craig D. Roberts, Sebastian M. Schmidt

Abstract: We present a novel method for extracting the proton radius from elastic electron-proton ($ep$) scattering data. The approach is based on interpolation via continued fractions augmented by statistical sampling and avoids any assumptions on the form of function used for the representation of data and subsequent extrapolation onto $Q^2\simeq 0$. Applying the method to extant modern $e p$ data sets, we find that all results are mutually consistent and, combining them, arrive at $r_p=0.847(8)\,$fm. This result compares favourably with values obtained from contemporary measurements of the Lamb shift in muonic hydrogen, transitions in electronic hydrogen, and muonic deuterium spectroscopy.

Submitted 18 July, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

Comments: 7 pages, 4 figures. Version accepted for publication in Phys. Rev. Lett

Report number: NJU-INP 033/21

Journal ref: Phys. Rev. Lett. 127, 092001 (2021)

arXiv:2101.12286 [pdf, other]

doi 10.1016/j.physletb.2021.136158

Measures of pion and kaon structure from generalised parton distributions

Authors: Jin-Li Zhang, Khépani Raya, Lei Chang, Zhu-Fang Cui, José Manuel Morgado, Craig D. Roberts, José Rodríguez-Quintero

Abstract: Pion and kaon structural properties provide insights into the emergence of mass within the Standard Model and attendant modulations by the Higgs boson. Novel expressions of these effects, in impact parameter space and in mass and pressure profiles, are exposed via $π$ and $K$ generalised parton distributions, built using the overlap representation from light-front wave functions constrained by one-dimensional valence distribution functions that describe available data. Notably, e.g. $K$ pressure profiles are spatially more compact than $π$ profiles and both achieve near-core pressures of similar magnitude to that found in neutron stars.

Submitted 17 February, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

Comments: 9 pages, 6 figures, 2 tables. Accepted for publication in Phys. Lett. B

Report number: NJU-INP 032/21

arXiv:2101.11873 [pdf, other]

A Graph-based Relevance Matching Model for Ad-hoc Retrieval

Authors: Yufeng Zhang, Jinghao Zhang, Zeyu Cui, Shu Wu, Liang Wang

Abstract: To retrieve more relevant, appropriate and useful documents given a query, finding clues about that query through the text is crucial. Recent deep learning models regard the task as a term-level matching problem, which seeks exact or similar query patterns in the document. However, we argue that they are inherently based on local interactions and do not generalise to ubiquitous, non-consecutive contextual relationships. In this work, we propose a novel relevance matching model based on graph neural networks to leverage the document-level word relationships for ad-hoc retrieval. In addition to the local interactions, we explicitly incorporate all contexts of a term through the graph-of-word text format. Matching patterns can be revealed accordingly to provide a more accurate relevance score. Our approach significantly outperforms strong baselines on two ad-hoc benchmarks. We also experimentally compare our model with BERT and show our advantages on long documents.

Submitted 28 January, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

Comments: To appear at AAAI 2021

arXiv:2101.06567 [pdf]

Back-n White Neutron Source at CSNS and its Applications

Authors: The CSNS Back-n Collaboration: Jing-Yu Tang, Qi An, Jiang-Bo Bai, Jie Bao, Yu Bao, Ping Cao, Hao-Lei Chen, Qi-Ping Chen, Yong-Hao Chen, Zhen Chen, Zeng-Qi Cui, Rui-Rui Fan, Chang-Qing Feng, Ke-Qing Gao, Xiao-Long Gao, Min-Hao Gu, Chang-Cai Han, Zi-Jie Han, Guo-Zhu He, Yong-Cheng He, Yang Hong, Yi-Wei Hu, Han-Xiong Huang, et al. (52 additional authors not shown)

Abstract: Back-streaming neutrons from the spallation target of the China Spallation Neutron Source (CSNS) that emit through the incoming proton channel were exploited to build a white neutron beam facility (the so-called Back-n white neutron source), which was completed in March 2018. The Back-n neutron beam is very intense, at approximately 2*10^7 n/cm^2/s at 55 m from the target, and has a nominal proton beam with a power of 100 kW in the CSNS-I phase and a kinetic energy of 1.6 GeV and a thick tungsten target in multiple slices with modest moderation from the cooling water through the slices. In addition, the excellent energy spectrum spanning from 0.5 eV to 200 MeV, and a good time resolution related to the time-of-flight measurements make it a typical white neutron source for nuclear data measurements; its overall performance is among that of the best white neutron sources in the world. Equipped with advanced spectrometers, detectors, and application utilities, the Back-n facility can serve wide applications, with a focus on neutron-induced cross-section measurements. This article presents an overview of the neutron beam characteristics, the experimental setups, and the ongoing applications at Back-n.

Submitted 16 January, 2021; originally announced January 2021.

Comments: 11 pages, 9 figures

arXiv:2012.13697 [pdf, other]

TSGCNet: Discriminative Geometric Feature Learning with Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation

Authors: Lingming Zhang, Yue Zhao, Deyu Meng, Zhiming Cui, Chenqiang Gao, Xinbo Gao, Chunfeng Lian, Dinggang Shen

Abstract: The ability to segment teeth precisely from digitized 3D dental models is an essential task in computer-aided orthodontic surgical planning. To date, deep learning based methods have been popularly used to handle this task. State-of-the-art methods directly concatenate the raw attributes of 3D inputs, namely coordinates and normal vectors of mesh cells, to train a single-stream network for fully-automated tooth segmentation. This, however, has the drawback of ignoring the different geometric meanings provided by those raw attributes. This issue might confuse the network in learning discriminative geometric features and result in many isolated false predictions on the dental model. To address this issue, we propose a two-stream graph convolutional network (TSGCNet) to learn multi-view geometric information from different geometric attributes. Our TSGCNet adopts two graph-learning streams, designed in an input-aware fashion, to extract more discriminative high-level geometric representations from coordinates and normal vectors, respectively. The feature representations learned from the two streams are further fused to integrate the multi-view complementary information for the cell-wise dense prediction task. We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners, and experimental results demonstrate that our method significantly outperforms state-of-the-art methods for 3D shape segmentation.

Submitted 26 December, 2020; originally announced December 2020.

Comments: 10 pages, 7 figures

arXiv:2012.08117 [pdf, other]

Writing Polishment with Simile: Task, Dataset and A Neural Approach

Authors: Jiayi Zhang, Zhi Cui, Xiaoqiang Xia, Yalong Guo, Yanran Li, Chen Wei, Jianwei Cui

Abstract: A simile is a figure of speech that directly makes a comparison, showing similarities between two different things, e.g. "Reading papers can be dull sometimes, like watching grass grow". Human writers often interpolate appropriate similes into proper locations of the plain text to vivify their writings. However, no existing work has explored neural simile interpolation, including both locating and generation. In this paper, we propose a new task of Writing Polishment with Simile (WPS) to investigate whether machines are able to polish texts with similes as humans do. Accordingly, we design a two-staged Locate&Gen model based on transformer architecture. Our model first locates where the simile interpolation should happen, and then generates a location-specific simile. We also release a large-scale Chinese Simile (CS) dataset containing 5 million similes with context. The experimental results demonstrate the feasibility of the WPS task and shed light on future research directions towards better automatic text polishment.

Submitted 15 December, 2020; originally announced December 2020.

Comments: Accepted in AAAI2021

arXiv:2012.07410 [pdf, other]

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Authors: Xiuying Chen, Zhi Cui, Jiayi Zhang, Chen Wei, Jianwei Cui, Bin Wang, Dongyan Zhao, Rui Yan

Abstract: In multi-turn dialog, utterances do not always take the full form of sentences \cite{Carbonell1983DiscoursePA}, which naturally makes understanding the dialog context more difficult. However, it is essential to fully grasp the dialog context to generate a reasonable response. Hence, in this paper, we propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog. Inspired by the multi-task learning scheme, we propose a joint framework that unifies these two tasks, sharing the same encoder to extract the common and task-invariant features with different decoders to learn task-specific features. To better fuse information from the question and the dialog history in the encoding part, we propose to augment the Transformer architecture with a memory updater, which is designed to selectively store and update the history dialog information so as to support downstream tasks. For the experiment, we employ human annotators to write and examine a large-scale dialog reading comprehension dataset. Extensive experiments are conducted on this dataset, and the results show that the proposed model brings substantial improvements over several strong baselines on both tasks. In this way, we demonstrate that reasoning can indeed improve response generation and vice versa. We release our large-scale dataset for further research.

Submitted 14 December, 2020; originally announced December 2020.

Comments: 9 pages, 1 figure

Journal ref: AAAI 2021

arXiv:2012.06707 [pdf, other]

Channel Modeling for UAV Communications: State of the Art, Case Studies, and Future Directions

Authors: Zhuangzhuang Cui, Ke Guan, César Briso-Rodríguez, Bo Ai, Zhangdui Zhong, Claude Oestges

Abstract: As essential aerial platforms, unmanned aerial vehicles (UAVs) play an increasingly important role in broad wireless connectivity and high-data-rate transmission for future communication systems. Notably, various communication scenarios are involved in UAV communications, such as intercommunications between UAVs and communications with the ground user equipment, the cellular base station, and the ground station, to name a few. However, existing works mostly focus on a single communication scenario, a designated channel type, and a specific operating frequency, thus urgently requiring a comprehensive understanding of multi-scenario, multi-frequency, and multi-type UAV channels. This article focuses on the essentials of corresponding air-to-air (A2A) and air-to-ground (A2G) channels in UAV communications. We first identify the latest key challenges of channel modeling for UAV communications. We then provide the state of the art for A2A and A2G channel properties and models based on extensive measurement campaigns. In particular, we conduct realistic case studies to further demonstrate critical channel characterizations and machine learning-based modeling methods. Last but not least, potential directions are widely discussed for paving the way towards more accurate and effective channel models for UAV communications.

Submitted 16 April, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

arXiv:2012.06628 [pdf, other]

Sat2Vid: Street-view Panoramic Video Synthesis from a Single Satellite Image

Authors: Zuoyue Li, Zhenqiang Li, Zhaopeng Cui, Rongjun Qin, Marc Pollefeys, Martin R. Oswald

Abstract: We present a novel method for synthesizing both temporally and geometrically consistent street-view panoramic video from a single satellite image and camera trajectory. Existing cross-view synthesis approaches focus on images, while video synthesis in such a case has not yet received enough attention. For geometrical and temporal consistency, our approach explicitly creates a 3D point cloud representation of the scene and maintains dense 3D-2D correspondences across frames that reflect the geometric scene configuration inferred from the satellite view. As for synthesis in the 3D space, we implement a cascaded network architecture with two hourglass modules to generate point-wise coarse and fine features from semantics and per-class latent vectors, followed by projection to frames and an upsampling module to obtain the final realistic video. By leveraging computed correspondences, the produced street-view video frames adhere to the 3D geometric scene structure and maintain temporal consistency. Qualitative and quantitative experiments demonstrate superior results compared to other state-of-the-art synthesis approaches that either lack temporal consistency or realistic appearance. To the best of our knowledge, our work is the first one to synthesize cross-view images to video.

Submitted 5 May, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

Comments: Technical Report

arXiv:2012.03171 [pdf, other]

doi 10.1109/TVT.2021.3063408

Coverage Probability Analysis of IRS-Aided Communication Systems

Authors: Zhuangzhuang Cui, Ke Guan, Jiayi Zhang, Zhangdui Zhong

Abstract: The intelligent reflective surface (IRS) technology has received much interest in recent years, thanks to its potential uses in future wireless communications, in which one of the promising use cases is to widen coverage, especially in the line-of-sight-blocked scenarios. Therefore, it is critical to analyze the corresponding coverage probability of IRS-aided communication systems. To the best of our knowledge, however, previous works focusing on this issue are very limited. In this paper, we analyze the coverage probability under the Rayleigh fading channel, taking the number and size of the array elements into consideration. We first derive the exact closed form of the coverage probability for the unit element. Afterward, with the method of moment matching, the approximation of the coverage probability can be formulated as the ratio of the upper incomplete Gamma function and the Gamma function, allowing an arbitrary number of elements. Finally, we comprehensively evaluate the impacts of essential factors on the coverage probability, such as the coefficient of the fading channel, the number and size of the elements, and the angle of incidence. Overall, the paper provides a succinct and general expression of coverage probability, which can be helpful in the performance evaluation and practical implementation of the IRS.

Submitted 5 December, 2020; originally announced December 2020.

arXiv:2011.13341 [pdf, other]

4D Human Body Capture from Egocentric Video via 3D Scene Grounding

Authors: Miao Liu, Dexin Yang, Yan Zhang, Zhaopeng Cui, James M. Rehg, Siyu Tang

Abstract: We introduce a novel task of reconstructing a time series of second-person 3D human body meshes from monocular egocentric videos. The unique viewpoint and rapid embodied camera motion of egocentric videos raise additional technical barriers for human body capture. To address those challenges, we propose a simple yet effective optimization-based approach that leverages 2D observations of the entire video sequence and human-scene interaction constraint to estimate second-person human poses, shapes, and global motion that are grounded on the 3D environment captured from the egocentric view. We conduct detailed ablation studies to validate our design choice. Moreover, we compare our method with the previous state-of-the-art method on human motion capture from monocular video, and show that our method estimates more accurate human-body poses and shapes under the challenging egocentric setting. In addition, we demonstrate that our approach produces more realistic human-scene interaction.

Submitted 15 October, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

Showing 251–300 of 494 results for author: Cui, Z