Search | arXiv e-print repository

What Are Expected Queries in End-to-End Object Detection?

Authors: Shilong Zhang, Xinjiang Wang, Jiaqi Wang, Jiangmiao Pang, Kai Chen

Abstract: End-to-end object detection has progressed rapidly since the emergence of DETR. DETRs use a set of sparse queries that replace the dense candidate boxes in most traditional detectors. In comparison, the sparse queries cannot guarantee a high recall as dense priors. However, making queries dense is not trivial in current frameworks. It not only suffers from heavy computational cost but also difficult optimization. As both sparse and dense queries are imperfect, then what are expected queries in end-to-end object detection? This paper shows that the expected queries should be Dense Distinct Queries (DDQ). Concretely, we introduce dense priors back to the framework to generate dense queries. A duplicate query removal pre-process is applied to these queries so that they are distinguishable from each other. The dense distinct queries are then iteratively processed to obtain final sparse outputs. We show that DDQ is stronger, more robust, and converges faster. It obtains 44.5 AP on the MS COCO detection dataset with only 12 epochs. DDQ is also robust as it outperforms previous methods on both object detection and instance segmentation tasks on various datasets. DDQ blends advantages from traditional dense priors and recent end-to-end detectors. We hope it can serve as a new baseline and inspire researchers to revisit the complementarity between traditional methods and end-to-end detectors. The source code is publicly available at https://github.com/jshilong/DDQ.

Submitted 2 June, 2022; originally announced June 2022.

Comments: The source code is publicly available at https://github.com/jshilong/DDQ
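
The duplicate query removal step mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that each dense query carries a box and a score and that near-duplicates are dropped by greedy, class-agnostic non-maximum suppression. The names `iou`, `distinct_queries`, and the threshold `iou_thr` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def distinct_queries(boxes: Sequence[Box], scores: Sequence[float],
                     iou_thr: float = 0.7) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is kept only if it overlaps every already-kept box by at most
    `iou_thr`, so the surviving queries are mutually distinct.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep
```

With a lower threshold more queries are treated as duplicates; the value to use in practice is a tuning choice that this listing does not specify.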

arXiv:2205.14727 [pdf, other]

CPED: A Large-Scale Chinese Personalized and Emotional Dialogue Dataset for Conversational AI

Authors: Yirong Chen, Weiquan Fan, Xiaofen Xing, Jianxin Pang, Minlie Huang, Wenjing Han, Qianfeng Tie, Xiangmin Xu

Abstract: Human language expression is based on the subjective construal of the situation instead of the objective truth conditions, which means that speakers' personalities and emotions after cognitive processing have an important influence on conversation. However, most existing datasets for conversational AI ignore human personalities and emotions, or only consider part of them. It's difficult for dialogue systems to understand speakers' personalities and emotions although large-scale pre-training language models have been widely used. In order to consider both personalities and emotions in the process of conversation generation, we propose CPED, a large-scale Chinese personalized and emotional dialogue dataset, which consists of multi-source knowledge related to empathy and personal characteristics. This knowledge covers gender, Big Five personality traits, 13 emotions, 19 dialogue acts and 10 scenes. CPED contains more than 12K dialogues of 392 speakers from 40 TV shows. We release the textual dataset with audio features and video features according to the copyright claims, privacy issues, terms of service of video platforms. We provide a detailed description of the CPED construction process and introduce three tasks for conversational AI, including personality recognition, emotion recognition in conversations as well as personalized and emotional conversation generation. Finally, we provide baseline systems for these tasks and consider the function of speakers' personalities and emotions on conversation.
Our motivation is to propose a dataset to be widely adopted by the NLP community as a new open benchmark for conversational AI research. The full dataset is available at https://github.com/scutcyr/CPED.

Submitted 29 May, 2022; originally announced May 2022.

arXiv:2205.09389 [pdf, other]

Simplifying Node Classification on Heterophilous Graphs with Compatible Label Propagation

Authors: Zhiqiang Zhong, Sergey Ivanov, Jun Pang

Abstract: Graph Neural Networks (GNNs) have been predominant for graph learning tasks; however, recent studies showed that a well-known graph algorithm, Label Propagation (LP), combined with a shallow neural network can achieve comparable performance to GNNs in semi-supervised node classification on graphs with high homophily. In this paper, we show that this approach falls short on graphs with low homophily, where nodes often connect to the nodes of the opposite classes. To overcome this, we carefully design a combination of a base predictor with LP algorithm that enjoys a closed-form solution as well as convergence guarantees. Our algorithm first learns the class compatibility matrix and then aggregates label predictions using LP algorithm weighted by class compatibilities. On a wide variety of benchmarks, we show that our approach achieves the leading performance on graphs with various levels of homophily. Meanwhile, it has orders of magnitude fewer parameters and requires less execution time. Empirical evaluations demonstrate that simple adaptations of LP can be competitive in semi-supervised node classification in both homophily and heterophily regimes.

Submitted 30 November, 2022; v1 submitted 19 May, 2022; originally announced May 2022.
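
To make the idea of label propagation weighted by class compatibilities more concrete, here is a minimal sketch. It is an illustrative assumption, not the paper's algorithm: the compatibility matrix `compat` is taken as given rather than learned, and the update rule F ← (1 − α)·B + α·Â·F·H (with Â the row-normalized adjacency, B the base predictions, H the compatibility matrix) is a hypothetical choice. All function names are made up for this example.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def row_normalize(m: Matrix) -> Matrix:
    """Scale each row to sum to 1 (rows of zeros stay zero)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out


def compatible_lp(adj: Matrix, base: Matrix, compat: Matrix,
                  alpha: float = 0.5, iters: int = 50) -> Matrix:
    """Iterate F <- (1 - alpha) * B + alpha * A_hat @ F @ H.

    Neighbor predictions are mapped through the compatibility matrix H
    before being mixed with the base predictions B, so a neighbor of
    class k votes for the classes that class k tends to connect to.
    """
    a_hat = row_normalize(adj)
    f = [row[:] for row in base]
    for _ in range(iters):
        prop = matmul(matmul(a_hat, f), compat)
        f = [[(1 - alpha) * b + alpha * p for b, p in zip(brow, prow)]
             for brow, prow in zip(base, prop)]
    return f
```

On a heterophilous graph, H would place most of its mass off the diagonal, so a confident neighbor pushes a node toward a different class than its own.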

arXiv:2204.06779 [pdf, other]

3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume

Authors: Jianye Pang, Cheng Jiang, Yihao Chen, Jianbo Chang, Ming Feng, Renzhi Wang, Jianhua Yao

Abstract: Dense prediction in medical volume provides enriched guidance for clinical analysis. CNN backbones have hit a bottleneck due to lack of long-range dependencies and global context modeling power. Recent works proposed to combine vision transformer with CNN, due to its strong global capture ability and learning capability. However, most works are limited to simply applying pure transformer with several fatal flaws (i.e., lack of inductive bias, heavy computation and little consideration for 3D data). Therefore, designing an elegant and efficient vision transformer learner for dense prediction in medical volume is promising and challenging. In this paper, we propose a novel 3D Shuffle-Mixer network of a new Local Vision Transformer-MLP paradigm for medical dense prediction. In our network, a local vision transformer block is utilized to shuffle and learn spatial context from full-view slices of rearranged volume, a residual axial-MLP is designed to mix and capture remaining volume context in a slice-aware manner, and an MLP view aggregator is employed to project the learned full-view rich context to the volume feature in a view-aware manner. Moreover, an Adaptive Scaled Enhanced Shortcut is proposed for local vision transformer to enhance feature along spatial and channel dimensions adaptively, and a CrossMerge is proposed to skip-connect the multi-scale feature appropriately in the pyramid architecture. Extensive experiments demonstrate the proposed model outperforms other state-of-the-art medical dense prediction methods.

Submitted 14 April, 2022; originally announced April 2022.

arXiv:2204.05445 [pdf, other]

Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness

Authors: Dianwen Ng, Jin Hui Pang, Yang Xiao, Biao Tian, Qiang Fu, Eng Siong Chng

Abstract: It is critical for a keyword spotting model to have a small footprint as it typically runs on-device with low computational resources. However, maintaining the previous SOTA performance with reduced model size is challenging. In addition, a far-field and noisy environment with multiple signals interference aggravates the problem, causing the accuracy to degrade significantly. In this paper, we present a multi-channel ConvMixer for speech command recognition. The novel architecture introduces an additional audio channel mixing for channel audio interaction in a multi-channel audio setting to achieve better noise-robust features with more efficient computation. Besides, we propose a centroid based awareness component to enhance the system by equipping it with additional spatial geometry information in the latent feature projection space. We evaluate our model using the new MISP challenge 2021 dataset. Our model achieves significant improvement against the official baseline with a 55% gain in the competition score (0.152) on raw microphone array input and a 63% (0.126) boost upon front-end speech enhancement.

Submitted 11 April, 2022; originally announced April 2022.

Comments: submitted to INTERSPEECH 2022

arXiv:2204.04807 [pdf, other]

doi 10.1007/JHEP07(2022)019

Spurious poles in a finite volume

Authors: Jin-Yi Pang, Martin Ebert, Hans-Werner Hammer, Fabian Müller, Akaki Rusetsky, Jia-Jun Wu

Abstract: Using effective-range expansion for the two-body amplitudes may generate spurious sub-threshold poles outside of the convergence range of the expansion. In the infinite volume, the emergence of such poles leads to inconsistencies in the three-body equations, e.g., to the breakdown of unitarity. We investigate the effect of the spurious poles on the three-body quantization condition in a finite volume and show that it leads to a peculiar dependence of the energy levels on the box size $L$. Furthermore, within a simple model, it is demonstrated that the procedure for the removal of these poles, which was recently proposed in Ref.[1] in the infinite volume, can be adapted to the finite-volume calculations. The structure of the exact energy levels is reproduced with an accuracy that systematically improves order by order in the EFT expansion.

Submitted 10 April, 2022; originally announced April 2022.

Comments: 26 pages, 9 figures

arXiv:2204.04656 [pdf, other]

Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation

Authors: Xiangtai Li, Wenwei Zhang, Jiangmiao Pang, Kai Chen, Guangliang Cheng, Yunhai Tong, Chen Change Loy

Abstract: This paper presents Video K-Net, a simple, strong, and unified framework for fully end-to-end video panoptic segmentation. The method is built upon K-Net, a method that unifies image segmentation via a group of learnable kernels. We observe that these learnable kernels from K-Net, which encode object appearances and contexts, can naturally associate identical instances across video frames. Motivated by this observation, Video K-Net learns to simultaneously segment and track "things" and "stuff" in a video with simple kernel-based appearance modeling and cross-temporal kernel interaction. Despite the simplicity, it achieves state-of-the-art video panoptic segmentation results on Cityscapes-VPS, KITTI-STEP, and VIPSeg without bells and whistles. In particular, on KITTI-STEP, the simple method yields almost 12% relative improvement over previous methods. On VIPSeg, Video K-Net yields almost 15% relative improvement and reaches 39.8% VPQ. We also validate its generalization on video semantic segmentation, where we boost various baselines by 2% on the VSPW dataset. Moreover, we extend K-Net into a clip-level video framework for video instance segmentation, where we obtain 40.5% mAP for ResNet50 backbone and 54.1% mAP for Swin-base on YouTube-2019 validation set. We hope this simple yet effective method can serve as a new, flexible baseline in unified video segmentation design. Both code and models are released at https://github.com/lxtGH/Video-K-Net.

Submitted 19 October, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

Comments: Accepted by CVPR-2022(oral); Add more experiments. Code is available at https://github.com/lxtGH/Video-K-Net

arXiv:2203.14360 [pdf, other]

Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking

Authors: Jinkun Cao, Jiangmiao Pang, Xinshuo Weng, Rawal Khirodkar, Kris Kitani

Abstract: Kalman filter (KF) based methods for multi-object tracking (MOT) make an assumption that objects move linearly. While this assumption is acceptable for very short periods of occlusion, linear estimates of motion for prolonged time can be highly inaccurate. Moreover, when there is no measurement available to update Kalman filter parameters, the standard convention is to trust the priori state estimations for posteriori update. This leads to the accumulation of errors during a period of occlusion. The error causes significant motion direction variance in practice. In this work, we show that a basic Kalman filter can still obtain state-of-the-art tracking performance if proper care is taken to fix the noise accumulated during occlusion. Instead of relying only on the linear state estimate (i.e., estimation-centric approach), we use object observations (i.e., the measurements by object detector) to compute a virtual trajectory over the occlusion period to fix the error accumulation of filter parameters during the occlusion period. This allows more time steps to correct errors accumulated during occlusion. We name our method Observation-Centric SORT (OC-SORT). It remains Simple, Online, and Real-Time but improves robustness during occlusion and non-linear motion. Given off-the-shelf detections as input, OC-SORT runs at 700+ FPS on a single CPU. It achieves state-of-the-art on multiple datasets, including MOT17, MOT20, KITTI, head tracking, and especially DanceTrack where the object motion is highly non-linear.
The code and models are available at https://github.com/noahcao/OC_SORT.

Submitted 15 March, 2023; v1 submitted 27 March, 2022; originally announced March 2022.

Comments: Accepted by CVPR 2023. 8 pages + 10 pages of appendix. Renamed OOS as Observation-centric Re-Update (ORU)
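
As a rough illustration of the "virtual trajectory over the occlusion period" described in the abstract above, the sketch below fills the gap between the last observation before an occlusion and the first observation after it. Linear interpolation between the two observations is an assumption made here for simplicity; the listing does not say how the trajectory is actually generated. The function name `virtual_trajectory` and the 2D point format are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed (x, y) box-center representation


def virtual_trajectory(last_obs: Point, new_obs: Point, gap: int) -> List[Point]:
    """Interpolate positions for the frames strictly between two observations.

    `gap` is the number of frames separating the two observations
    (t_new - t_last), so `gap - 1` virtual positions are returned.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1 frame")
    dx = new_obs[0] - last_obs[0]
    dy = new_obs[1] - last_obs[1]
    return [(last_obs[0] + dx * k / gap, last_obs[1] + dy * k / gap)
            for k in range(1, gap)]
```

In a tracker, such virtual positions could be fed to the Kalman filter's update step in place of the missing measurements, so that the filter state is corrected frame by frame instead of drifting on predictions alone.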

arXiv:2203.12371 [pdf]

High Phonon Scattering Rates Suppress Thermal Conductivity in Hyperstoichiometric Uranium Dioxide

Authors: Hao Ma, Matthew S. Bryan, Judy W. L. Pang, Douglas L. Abernathy, Daniel J. Antonio, Krzysztof Gofryk, Michael E. Manley

Abstract: Uranium dioxide (UO$_2$), one of the most important nuclear fuels, can accumulate excess oxygen atoms as interstitial defects, which significantly impacts thermal properties. In this study, thermal conductivities and inelastic neutron scattering measurements on UO$_2$ and UO$_{2+x}$ (x=0.3, 0.4, 0.8, 0.11) were performed at low temperatures (2-300 K). The thermal conductivity of UO$_{2+x}$ is significantly suppressed compared to UO$_2$ except near the Néel temperature $T_N$ = 30.8 K, where it is independent of x. Phonon measurements demonstrate that the heat capacities and phonon group velocities of UO$_2$ and UO$_{2+x}$ are similar and that the suppressed thermal conductivity in UO$_{2+x}$ results from high phonon scattering rates. These new insights advance our fundamental understanding of thermal transport properties in advanced nuclear fuels.

Submitted 23 March, 2022; originally announced March 2022.

arXiv:2203.11075 [pdf, other]

Dense Siamese Network for Dense Unsupervised Learning

Authors: Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

Abstract: This paper presents Dense Siamese Network (DenseSiam), a simple unsupervised learning framework for dense prediction tasks. It learns visual representations by maximizing the similarity between two views of one image with two types of consistency, i.e., pixel consistency and region consistency. Concretely, DenseSiam first maximizes the pixel level spatial consistency according to the exact location correspondence in the overlapped area. It also extracts a batch of region embeddings that correspond to some sub-regions in the overlapped area to be contrasted for region consistency. In contrast to previous methods that require negative pixel pairs, momentum encoders or heuristic masks, DenseSiam benefits from the simple Siamese network and optimizes the consistency of different granularities. It also proves that the simple location correspondence and interacted region embeddings are effective enough to learn the similarity. We apply DenseSiam on ImageNet and obtain competitive improvements on various downstream tasks. We also show that only with some extra task-specific losses, the simple framework can directly conduct dense prediction tasks. On an existing unsupervised semantic segmentation benchmark, it surpasses state-of-the-art segmentation methods by 2.1 mIoU with 28% training costs. Code and models are released at https://github.com/ZwwWayne/DenseSiam.

Submitted 10 August, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

Comments: ECCV2022 camera ready

arXiv:2203.10866 [pdf, other]

Unsupervised Network Embedding Beyond Homophily

Authors: Zhiqiang Zhong, Guadalupe Gonzalez, Daniele Grattarola, Jun Pang

Abstract: Network embedding (NE) approaches have emerged as a predominant technique to represent complex networks and have benefited numerous tasks. However, most NE approaches rely on a homophily assumption to learn embeddings with the guidance of supervisory signals, leaving the unsupervised heterophilous scenario relatively unexplored. This problem becomes especially relevant in fields where a scarcity of labels exists. Here, we formulate the unsupervised NE task as an r-ego network discrimination problem and develop the SELENE framework for learning on networks with homophily and heterophily. Specifically, we design a dual-channel feature embedding pipeline to discriminate r-ego networks using node attributes and structural information separately. We employ heterophily adapted self-supervised learning objective functions to optimise the framework to learn intrinsic node embeddings. We show that SELENE's components improve the quality of node embeddings, facilitating the discrimination of connected heterophilous nodes. Comprehensive empirical evaluations on both synthetic and real-world datasets with varying homophily ratios validate the effectiveness of SELENE in homophilous and heterophilous settings showing an up to 12.52% clustering accuracy gain.

Submitted 22 December, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

Comments: Accepted to Transactions on Machine Learning Research

arXiv:2203.03812 [pdf, other]

SpeechFormer: A Hierarchical Efficient Framework Incorporating the Characteristics of Speech

Authors: Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du

Abstract: Transformer has obtained promising results in the field of cognitive speech signal processing, which is of interest in various applications ranging from emotion to neurocognitive disorder analysis. However, most works treat the speech signal as a whole, leading to the neglect of the pronunciation structure that is unique to speech and reflects the cognitive process. Meanwhile, Transformer has a heavy computational burden due to its full attention operation. In this paper, a hierarchical efficient framework, called SpeechFormer, which considers the structural characteristics of speech, is proposed and can serve as a general-purpose backbone for cognitive speech signal processing. The proposed SpeechFormer consists of frame, phoneme, word and utterance stages in succession, each performing a neighboring attention according to the structural pattern of speech with high computational efficiency. SpeechFormer is evaluated on speech emotion recognition (IEMOCAP & MELD) and neurocognitive disorder detection (Pitt & DAIC-WOZ) tasks, and the results show that SpeechFormer outperforms the standard Transformer-based framework while greatly reducing the computational cost. Furthermore, our SpeechFormer achieves comparable results to the state-of-the-art approaches.

Submitted 9 March, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

Comments: 5 pages, 4 figures. This paper was submitted to Interspeech 2022

arXiv:2203.00175 [pdf, other]

Nonconvex and Nonsmooth Approaches for Affine Chance-Constrained Stochastic Programs

Authors: Ying Cui, Junyi Liu, Jong-Shi Pang

Abstract: Chance-constrained programs (CCPs) constitute a difficult class of stochastic programs due to their possible nondifferentiability and nonconvexity even with simple linear random functionals. Existing approaches for solving the CCPs mainly deal with convex random functionals within the probability function. In the present paper, we consider two generalizations of the class of chance constraints commonly studied in the literature; one generalization involves probabilities of disjunctive nonconvex functional events and the other generalization involves mixed-signed affine combinations of the resulting probabilities; together, we coin the term affine chance constraint (ACC) system for these generalized chance constraints. Our proposed treatment of such an ACC system involves the fusion of several individually known ideas: (a) parameterized upper and lower approximations of the indicator function in the expectation formulation of probability; (b) external (i.e., fixed) versus internal (i.e., sequential) sampling-based approximation of the expectation operator; (c) constraint penalization as relaxations of feasibility; and (d) convexification of nonconvexity and nondifferentiability via surrogation. The integration of these techniques for solving the affine chance-constrained stochastic program (ACC-SP) with various degrees of practicality and computational efforts is the main contribution of this paper.

Submitted 28 February, 2022; originally announced March 2022.

arXiv:2202.03472 [pdf, other]

New Bounds on the Size of Binary Codes with Large Minimum Distance

Authors: James Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan

Abstract: Let $A(n, d)$ denote the maximum size of a binary code of length $n$ and minimum Hamming distance $d$. Studying $A(n, d)$, including efforts to determine it as well as to derive bounds on $A(n, d)$ for large $n$'s, is one of the most fundamental subjects in coding theory. In this paper, we explore new lower and upper bounds on $A(n, d)$ in the large-minimum distance regime, in particular, when $d = n/2 - \Omega(\sqrt{n})$. We first provide a new construction of cyclic codes, by carefully selecting specific roots in the binary extension field for the check polynomial, with length $n= 2^m -1$, distance $d \geq n/2 - 2^{c-1}\sqrt{n}$, and size $n^{c+1/2}$, for any $m\geq 4$ and any integer $c$ with $0 \leq c \leq m/2 - 1$. These code parameters are slightly worse than those of the Delsarte--Goethals (DG) codes that provide the previously known best lower bound in the large-minimum distance regime. However, using a similar and extended code construction technique we show a sequence of cyclic codes that improve upon DG codes and provide the best lower bound in a narrower range of the minimum distance $d$, in particular, when $d = n/2 - \Omega(n^{2/3})$. Furthermore, by leveraging a Fourier-analytic view of Delsarte's linear program, upper bounds on $A(n, n/2 - \rho\sqrt{n})$ with $\rho\in (0.5, 9.5)$ are obtained that scale polynomially in $n$. To the best of the authors' knowledge, the upper bound due to Barg and Nogin is the only previously known upper bound that scales polynomially in $n$ in this regime.
We numerically demonstrate that our upper bound improves upon the Barg-Nogin upper bound in the specified high-minimum distance regime.

Submitted 23 May, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: 23 pages

arXiv:2112.12421 [pdf, other]

Multiphysics mixed finite element method with Nitsche's technique for Stokes poroelasticity problem

Authors: Zhihao Ge, Jin'ge Pang, Jiwei Cao

Abstract: In this paper, we propose a multiphysics mixed finite element method with Nitsche's technique for the Stokes-poroelasticity problem. Firstly, we present a multiphysics reformulation of the poroelasticity part of the original problem by introducing two pseudo-pressures to reveal the underlying deformation and diffusion multiphysical processes in the Stokes-poroelasticity problem. Then, we prove the existence and uniqueness of the weak solution of the reformulated and original problems. We use Nitsche's technique to approximate the coupling condition at the interface to propose a loosely-coupled time-stepping method -- multiphysics mixed finite element method for space variables, and we decouple the reformulated problem into three sub-problems at each time step -- a Stokes problem, a generalized Stokes problem and a mixed diffusion problem. Also, we give the stability analysis and error estimates of the loosely-coupled time-stepping method.

Submitted 23 December, 2021; originally announced December 2021.

Comments: 37 pages, 11 figures. arXiv admin note: text overlap with arXiv:1403.5707 by other authors

MSC Class: 65N30 ACM Class: G.1.8

arXiv:2112.03886 [pdf, other]

Some Strongly Polynomially Solvable Convex Quadratic Programs with Bounded Variables

Authors: Jong-Shi Pang, Shaoning Han

Abstract: This paper begins with a class of convex quadratic programs (QPs) with bounded variables solvable by the parametric principal pivoting algorithm with $\mathcal{O}(n^3)$ strongly polynomial complexity, where $n$ is the number of variables of the problem. Extension of the Hessian class is also discussed. Our research is motivated by a recent reference [7] wherein the efficient solution of a quadratic program with a tridiagonal Hessian matrix in the quadratic objective is needed for the construction of a polynomial-time algorithm for solving an associated sparse variable selection problem. With the tridiagonal structure, the complexity of the QP algorithm reduces to $\mathcal{O}(n^2)$. Our strong polynomiality results extend previous works of some strongly polynomially solvable linear complementarity problems with a P-matrix [9]; special cases of the extended results include weakly quasi-diagonally dominant problems in addition to the tridiagonal ones.

Submitted 27 September, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

arXiv:2111.04946 [pdf, other]

doi 10.1109/TIP.2022.3214077

Graph-Based Depth Denoising & Dequantization for Point Cloud Enhancement

Authors: Xue Zhang, Gene Cheung, Jiahao Pang, Yash Sanghvi, Abhiram Gnanasambandam, Stanley H. Chan

Abstract: A 3D point cloud is typically constructed from depth measurements acquired by sensors at one or more viewpoints. The measurements suffer from both quantization and noise corruption. To improve quality, previous works denoise a point cloud \textit{a posteriori} after projecting the imperfect depth data onto 3D space. Instead, we enhance depth measurements directly on the sensed images \textit{a priori}, before synthesizing a 3D point cloud. By enhancing near the physical sensing process, we tailor our optimization to our depth formation model before subsequent processing steps that obscure measurement errors. Specifically, we model depth formation as a combined process of signal-dependent noise addition and non-uniform log-based quantization. The designed model is validated (with parameters fitted) using collected empirical data from a representative depth sensor. To enhance each pixel row in a depth image, we first encode intra-view similarities between available row pixels as edge weights via feature graph learning. We next establish inter-view similarities with another rectified depth image via viewpoint mapping and sparse linear interpolation. This leads to a maximum a posteriori (MAP) graph filtering objective that is convex and differentiable. We minimize the objective efficiently using accelerated gradient descent (AGD), where the optimal step size is approximated via the Gershgorin circle theorem (GCT). Experiments show that our method significantly outperformed recent point cloud denoising schemes and state-of-the-art image denoising schemes in two established point cloud quality metrics.

Submitted 6 October, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

Comments: 16 pages, 14 figures
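The step-size idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a generic convex quadratic objective $\frac{1}{2}x^\top A x - b^\top x$ with a symmetric positive semidefinite matrix `A`, and it uses plain gradient descent rather than the accelerated variant (AGD) named in the abstract; the function names are hypothetical and this is not the authors' implementation. The Gershgorin circle theorem bounds the largest eigenvalue of `A` by the largest row value of the diagonal entry plus the sum of absolute off-diagonal entries, which gives a safe step size of one over that bound without an eigendecomposition.

```python
import numpy as np


def gershgorin_upper_bound(A):
    """Upper bound on the largest eigenvalue of a symmetric matrix A.

    Every eigenvalue lies in some disc centred at a_ii with radius
    sum_{j != i} |a_ij|, so max_i (a_ii + radius_i) bounds them all.
    """
    A = np.asarray(A, dtype=float)
    diag = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(diag)
    return float(np.max(diag + radii))


def gradient_descent_quadratic(A, b, steps=200):
    """Minimise 0.5 * x^T A x - b^T x (assumed convex) by gradient descent.

    The step size 1 / L uses the Gershgorin bound L >= lambda_max(A);
    the bound is assumed to be strictly positive.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    step = 1.0 / gershgorin_upper_bound(A)
    x = np.zeros_like(b)
    for _ in range(steps):
        x = x - step * (A @ x - b)  # gradient of the quadratic is A x - b
    return x


if __name__ == "__main__":
    A = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
    b = np.array([1.0, 0.0, 1.0])
    print(gershgorin_upper_bound(A))
    print(gradient_descent_quadratic(A, b))
```

Because the bound is never smaller than the true largest eigenvalue, the resulting step is conservative: the iteration cannot diverge on a convex quadratic, at the cost of possibly slower convergence than with the exact Lipschitz constant.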

arXiv:2110.09351 [pdf, other]

doi 10.1007/JHEP02(2022)158

Relativistic-invariant formulation of the NREFT three-particle quantization condition

Authors: Fabian Müller, Jin-Yi Pang, Akaki Rusetsky, Jia-Jun Wu

Abstract: A three-particle quantization condition on the lattice is written down in a manifestly relativistic-invariant form by using a generalization of the non-relativistic effective field theory (NREFT) approach. Inclusion of the higher partial waves is explicitly addressed. A partial diagonalization of the quantization condition into the various irreducible representations of the (little groups of the) octahedral group has been carried out both in the center-of-mass frame and in moving frames. Furthermore, producing synthetic data in a toy model, the relativistic invariance is explicitly demonstrated for the three-body bound state spectrum.

Submitted 10 February, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

Comments: 38 pages, 7 figures

arXiv:2110.02766 [pdf, other]

doi 10.1103/PhysRevD.105.L071502

Generalization of Weinberg's Compositeness Relations

Authors: Yan Li, Feng-Kun Guo, Jin-Yi Pang, Jia-Jun Wu

Abstract: We generalize Weinberg's time-honored compositeness relations by including the range corrections through a general form factor. In his derivation, Weinberg considered the effective range expansion up to $\mathcal{O}(p^2)$ and made two additional approximations: neglecting the non-pole term in the Low equation, and approximating the form factor by a constant. We lift the second approximation and work out an analytic expression for the form factor. For a positive effective range, the form factor is of a single-pole form. An integral representation of the compositeness is obtained and is expected to have a smaller uncertainty than that derived from Weinberg's relations. We also establish an exact relation between the wave function of a bound state and the phase of the scattering amplitude neglecting the non-pole term. The deuteron is analyzed as an example, and the formalism can be applied to other cases where range corrections are important.

Submitted 23 April, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

Comments: 9 pages, 2 figures

Journal ref: Phys. Rev. D 105, 071502 (2022)

arXiv:2109.14539

New Solution based on Hodge Decomposition for Abstract Games

Authors: Yihao Luo, Jinhui Pang, Weibin Han, Huafei Sun

Abstract: This paper proposes Hodge Potential Choice (HPC), a new solution for abstract games with irreflexive dominance relations. This solution is formulated by applying geometric tools such as differential forms and Hodge decomposition to abstract games. We provide a workable algorithm for the proposed solution with a new data structure of abstract games. From the view of gaming, HPC overcomes several weaknesses of conventional solutions. HPC coincides with Copeland Choice in complete cases and can be extended to solve games with marginal strengths. It will be proven that the Hodge potential choice possesses three prevalent axiomatic properties: neutrality, strong monotonicity, dominance cycles reversing independence, and sensitivity to mutual dominance. To compare HPC with Copeland Choice in large samples of games, we design digital experiments with randomly generated abstract games of different sizes and completeness. The experimental results show the advantage of HPC in the statistical sense.

Submitted 26 January, 2024; v1 submitted 29 September, 2021; originally announced September 2021.

Comments: Need further polishing and add new sections

MSC Class: 91-08
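For readers unfamiliar with the Copeland Choice baseline that the abstract above compares against, the following is a minimal sketch under one common definition: each alternative scores the number of alternatives it dominates minus the number that dominate it, and the choice set is every alternative with the maximum score. This illustrates the conventional rule only, not the proposed Hodge Potential Choice; the function names and the pair-set encoding of the dominance relation are assumptions made for illustration.

```python
def copeland_scores(alternatives, dominates):
    """Copeland score per alternative: wins minus losses.

    `dominates` is a set of (winner, loser) pairs describing an
    irreflexive dominance relation (hypothetical encoding).
    """
    scores = {a: 0 for a in alternatives}
    for winner, loser in dominates:
        scores[winner] += 1
        scores[loser] -= 1
    return scores


def copeland_choice(alternatives, dominates):
    """Return all alternatives attaining the maximum Copeland score."""
    scores = copeland_scores(alternatives, dominates)
    best = max(scores.values())
    return sorted(a for a in alternatives if scores[a] == best)


if __name__ == "__main__":
    alts = ["a", "b", "c", "d"]
    rel = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "a")}
    print(copeland_scores(alts, rel))
    print(copeland_choice(alts, rel))
```

In this small example the relation is incomplete (for instance, `b` and `d` are not compared), which is the kind of case where, according to the abstract, HPC and Copeland Choice may differ.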

arXiv:2108.10175 [pdf, other]

doi 10.1007/s11263-021-01434-2

Towards Balanced Learning for Instance Recognition

Authors: Jiangmiao Pang, Kai Chen, Qi Li, Zhihai Xu, Huajun Feng, Jianping Shi, Wanli Ouyang, Dahua Lin

Abstract: Instance recognition has advanced rapidly along with the development of various deep convolutional neural networks. Compared to network architectures, the training process, which is also crucial to the success of detectors, has received relatively less attention. In this work, we carefully revisit the standard training practice of detectors and find that detection performance is often limited by imbalance during the training process, which generally arises at three levels: sample level, feature level, and objective level. To mitigate the resulting adverse effects, we propose Libra R-CNN, a simple yet effective framework towards balanced learning for instance recognition. It integrates IoU-balanced sampling, a balanced feature pyramid, and objective re-weighting to reduce the imbalance at the sample, feature, and objective levels, respectively. Extensive experiments conducted on the MS COCO, LVIS, and Pascal VOC datasets demonstrate the effectiveness of the overall balanced design.

Submitted 23 August, 2021; originally announced August 2021.

Comments: Accepted by IJCV. Journal extension of paper arXiv:1904.02701

arXiv:2108.03557 [pdf, other]

doi 10.1109/TCSVT.2022.3206476

Context-Aware Mixup for Domain Adaptive Semantic Segmentation

Authors: Qianyu Zhou, Zhengyang Feng, Qiqi Gu, Jiangmiao Pang, Guangliang Cheng, Xuequan Lu, Jianping Shi, Lizhuang Ma

Abstract: Unsupervised domain adaptation (UDA) aims to adapt a model of the labeled source domain to an unlabeled target domain. Existing UDA-based semantic segmentation approaches always reduce the domain shifts in pixel level, feature level, and output level. However, almost all of them largely neglect the contextual dependency, which is generally shared across different domains, leading to less-desired performance. In this paper, we propose a novel Context-Aware Mixup (CAMix) framework for domain adaptive semantic segmentation, which exploits this important clue of context-dependency as explicit prior knowledge in a fully end-to-end trainable manner for enhancing the adaptability toward the target domain. Firstly, we present a contextual mask generation strategy by leveraging the accumulated spatial distributions and prior contextual relationships. The generated contextual mask is critical in this work and will guide the context-aware domain mixup on three different levels. Besides, provided the context knowledge, we introduce a significance-reweighted consistency loss to penalize the inconsistency between the mixed student prediction and the mixed teacher prediction, which alleviates the negative transfer of the adaptation, e.g., early performance degradation. Extensive experiments and analysis demonstrate the effectiveness of our method against the state-of-the-art approaches on widely-used UDA benchmarks.

Submitted 11 September, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

Comments: Accepted to IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

arXiv:2108.03553 [pdf, other]

Self-Adversarial Disentangling for Specific Domain Adaptation

Authors: Qianyu Zhou, Qiqi Gu, Jiangmiao Pang, Xuequan Lu, Lizhuang Ma

Abstract: Domain adaptation aims to bridge the domain shifts between the source and the target domain. These shifts may span different dimensions such as fog, rainfall, etc. However, recent methods typically do not consider explicit prior knowledge about the domain shifts on a specific dimension, thus leading to less desired adaptation performance. In this paper, we study a practical setting called Specific Domain Adaptation (SDA) that aligns the source and target domains in a demanded-specific dimension. Within this setting, we observe that the intra-domain gap induced by different domainness (i.e., numerical magnitudes of domain shifts in this dimension) is crucial when adapting to a specific domain. To address the problem, we propose a novel Self-Adversarial Disentangling (SAD) framework. In particular, given a specific dimension, we first enrich the source domain by introducing a domainness creator that provides additional supervisory signals. Guided by the created domainness, we design a self-adversarial regularizer and two loss functions to jointly disentangle the latent representations into domainness-specific and domainness-invariant features, thus mitigating the intra-domain gap. Our method can be easily used as a plug-and-play framework and does not introduce any extra cost at inference time. We achieve consistent improvements over state-of-the-art methods in both object detection and semantic segmentation.

Submitted 5 December, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

arXiv:2107.14160 [pdf, other]

Probabilistic and Geometric Depth: Detecting Objects in Perspective

Authors: Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

Abstract: 3D object detection is an important capability needed in various practical applications such as driver assistance systems. Monocular 3D detection, as a representative general setting among image-based approaches, provides a more economical solution than conventional settings relying on LiDARs but still yields unsatisfactory results. This paper first presents a systematic study on this problem. We observe that the current monocular 3D detection can be simplified as an instance depth estimation problem: The inaccurate instance depth blocks all the other 3D attribute predictions from improving the overall detection performance. Moreover, recent methods directly estimate the depth based on isolated instances or pixels while ignoring the geometric relations across different objects. To this end, we construct geometric relation graphs across predicted objects and use the graph to facilitate depth estimation. As the preliminary depth estimation of each instance is usually inaccurate in this ill-posed setting, we incorporate a probabilistic representation to capture the uncertainty. It provides an important indicator to identify confident predictions and further guide the depth propagation. Despite the simplicity of the basic idea, our method, PGD, obtains significant improvements on KITTI and nuScenes benchmarks, achieving 1st place out of all monocular vision-only methods while still maintaining real-time efficiency. Code and models will be released at https://github.com/open-mmlab/mmdetection3d.

Submitted 25 November, 2021; v1 submitted 29 July, 2021; originally announced July 2021.

Comments: Conference on Robot Learning (CoRL) 2021

arXiv:2106.14855 [pdf, other]

K-Net: Towards Unified Image Segmentation

Authors: Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

Abstract: Semantic, instance, and panoptic segmentations have been addressed using different and specialized frameworks despite their underlying connections. This paper presents a unified, simple, and effective framework for these essentially similar tasks. The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class. To remedy the difficulties of distinguishing various instances, we propose a kernel update strategy that makes each kernel dynamic and conditional on its meaningful group in the input image. K-Net can be trained in an end-to-end manner with bipartite matching, and its training and inference are naturally NMS-free and box-free. Without bells and whistles, K-Net surpasses all previously published state-of-the-art single-model results of panoptic segmentation on the MS COCO test-dev split and semantic segmentation on the ADE20K val split with 55.2% PQ and 54.3% mIoU, respectively. Its instance segmentation performance is also on par with Cascade Mask R-CNN on MS COCO with 60%-90% faster inference speeds. Code and models will be released at https://github.com/ZwwWayne/K-Net/.

Submitted 1 November, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

Comments: Camera ready for NeurIPS2021

arXiv:2106.14192 [pdf, other]

Disentangling semantic features of macromolecules in Cryo-Electron Tomography

Authors: Kai Yi, Jianye Pang, Yungeng Zhang, Xiangrui Zeng, Min Xu

Abstract: Cryo-electron tomography (Cryo-ET) is a 3D imaging technique that enables the systemic study of shape, abundance, and distribution of macromolecular structures in single cells in near-atomic resolution. However, the systematic and efficient $\textit{de novo}$ recognition and recovery of macromolecular structures captured by Cryo-ET are very challenging due to the structural complexity and imaging limits. Even macromolecules with identical structures have various appearances due to different orientations and imaging limits, such as noise and the missing wedge effect. Explicitly disentangling the semantic features of macromolecules is crucial for performing several downstream analyses on the macromolecules. This paper addresses the problem by proposing a 3D Spatial Variational Autoencoder that explicitly disentangles the structure, orientation, and shift of macromolecules. Extensive experiments on both synthesized and real Cryo-ET datasets and cross-domain evaluations demonstrate the efficacy of our method.

Submitted 27 June, 2021; originally announced June 2021.

arXiv:2106.11532 [pdf, other]

doi 10.1109/ICASSP43922.2022.9746598

Key-Sparse Transformer for Multimodal Speech Emotion Recognition

Authors: Weidong Chen, Xiaofeng Xing, Xiangmin Xu, Jichen Yang, Jianxin Pang

Abstract: Speech emotion recognition is a challenging research topic that plays a critical role in human-computer interaction. Multimodal inputs further improve the performance as more emotional information is used. However, existing studies learn all the information in the sample while only a small portion of it is about emotion. The redundant information becomes noise and limits the system performance. In this paper, a key-sparse Transformer is proposed for efficient emotion recognition by focusing more on emotion-related information. The proposed method is evaluated on the IEMOCAP and LSSED datasets. Experimental results show that the proposed method achieves better performance than the state-of-the-art approaches.

Submitted 27 February, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

Comments: This paper was accepted by ICASSP 2022

arXiv:2105.08666 [pdf, other]

Reinforcement Learning With Sparse-Executing Actions via Sparsity Regularization

Authors: Jing-Cheng Pang, Tian Xu, Shengyi Jiang, Yu-Ren Liu, Yang Yu

Abstract: Reinforcement learning (RL) has made remarkable progress in many decision-making tasks, such as Go, game playing, and robotics control. However, classic RL approaches often presume that all actions can be executed an infinite number of times, which is inconsistent with many decision-making scenarios in which actions have limited budgets or execution opportunities. Imagine an agent playing a gunfighting game with limited ammunition. It only fires when the enemy appears in the correct position, making shooting a sparse-executing action. Such sparse-executing actions have not been considered by classic RL algorithms in problem formulation or algorithm design. This paper attempts to address sparse-executing action issues by first formalizing the problem as a Sparse Action Markov Decision Process (SA-MDP), in which certain actions in the action space can only be executed a limited number of times. Then, we propose a policy optimization algorithm called Action Sparsity REgularization (ASRE) that gives each action a distinct preference. ASRE evaluates action sparsity through constrained action sampling and regularizes policy training based on the evaluated action sparsity, represented by the action distribution. In experiments on tasks with known sparse-executing actions, where classical RL algorithms struggle to train a policy efficiently, ASRE effectively constrains the action sampling and outperforms baselines. Moreover, we show that ASRE can generally improve performance in Atari games, demonstrating its broad applicability.

Submitted 5 January, 2023; v1 submitted 18 May, 2021; originally announced May 2021.

Comments: 12 pages, 10 figures

arXiv:2105.07253 [pdf, other]

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

Authors: Xu-Hui Liu, Zhenghai Xue, Jing-Cheng Pang, Shengyi Jiang, Feng Xu, Yang Yu

Abstract: In reinforcement learning, experience replay stores past samples for further reuse. Prioritized sampling is a promising technique to better utilize these samples. Previous criteria of prioritization include TD error, recentness and corrective feedback, which are mostly heuristically designed. In this work, we start from the regret minimization objective, and obtain an optimal prioritization strategy for Bellman update that can directly maximize the return of the policy. The theory suggests that data with higher hindsight TD error, better on-policiness and more accurate Q value should be assigned with higher weights during sampling. Thus most previous criteria only consider this strategy partially. We not only provide theoretical justifications for previous criteria, but also propose two new methods to compute the prioritization weight, namely ReMERN and ReMERT. ReMERN learns an error network, while ReMERT exploits the temporal ordering of states. Both methods outperform previous prioritized sampling algorithms in challenging RL benchmarks, including MuJoCo, Atari and Meta-World.

Submitted 9 November, 2021; v1 submitted 15 May, 2021; originally announced May 2021.

arXiv:2104.11004 [pdf, other]

Hazy Re-ID: An Interference Suppression Model For Domain Adaptation Person Re-identification Under Inclement Weather Condition

Authors: Jian Pang, Dacheng Zhang, Huafeng Li, Weifeng Liu, Zhengtao Yu

Abstract: In a conventional domain adaptation person re-identification (Re-ID) task, both the training and test images in the target domain are collected under sunny weather. In reality, however, the pedestrians to be retrieved may be captured under severe weather conditions such as haze, dust, and snow. This paper proposes a novel Interference Suppression Model (ISM) to deal with the interference caused by hazy weather in domain adaptation person Re-ID. A teacher-student model is used in the ISM to distill the interference information at the feature level by reducing the discrepancy between the clear and the hazy intrinsic similarity matrix. Furthermore, at the distribution level, an extra discriminator is introduced to help the student model make the interference feature distribution clearer. The experimental results show that the proposed method achieves superior performance on two synthetic datasets compared with state-of-the-art methods. The related code will be released online at https://github.com/pangjian123/ISM-ReID.

Submitted 22 April, 2021; originally announced April 2021.

Comments: Accepted by ICME2021 as oral

arXiv:2104.10956 [pdf, other]

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

Authors: Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

Abstract: Monocular 3D object detection is an important task for autonomous driving considering its advantage of low cost. It is much more challenging than conventional 2D cases due to its inherent ill-posed property, which is mainly reflected in the lack of depth information. Recent progress on 2D detection offers opportunities to better solve this problem. However, it is non-trivial to make a general adapted 2D detector work in this 3D task. In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework, FCOS3D. Specifically, we first transform the commonly defined 7-DoF 3D targets to the image domain and decouple them as 2D and 3D attributes. Then the objects are distributed to different feature levels with consideration of their 2D scales and assigned only according to the projected 3D-center for the training procedure. Furthermore, the center-ness is redefined with a 2D Gaussian distribution based on the 3D-center to fit the 3D target formulation. All of these make this framework simple yet effective, getting rid of any 2D detection or 2D-3D correspondence priors. Our solution achieves 1st place out of all the vision-only methods in the nuScenes 3D detection challenge of NeurIPS 2020. Code and models are released at https://github.com/open-mmlab/mmdetection3d.

Submitted 24 September, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

Comments: Camera-ready version of 3DODI workshop at ICCV 2021; Technical report for the best vision-only method (1st place of the camera track) in the nuScenes 3D detection challenge of NeurIPS 2020

arXiv:2104.04331 [pdf, other]

The Burden of Being a Bridge: Analysing Subjective Well-Being of Twitter Users during the COVID-19 Pandemic

Authors: Ninghan Chen, Xihui Chen, Zhiqiang Zhong, Jun Pang

Abstract: The outbreak of the COVID-19 pandemic triggered an infodemic over online social media, which significantly impacts public health around the world, both physically and psychologically. In this paper, we study the impact of the pandemic on the mental health of influential social media users, whose sharing behaviours significantly promote the diffusion of COVID-19 related information. Specifically, we focus on subjective well-being (SWB), and analyse whether SWB changes have a relationship with their bridging performance in information diffusion, which measures the gain in speed and reach of information transmission due to their sharing. We accurately capture users' bridging performance by proposing a new measurement. Benefiting from deep-learning natural language processing models, we quantify social media users' SWB from their textual posts. With the data collected from Twitter for almost two years, we reveal the greater mental suffering of influential users during the COVID-19 pandemic. Through comprehensive hierarchical multiple regression analysis, we are the first to discover the strong relationship between social users' SWB and their bridging performance.

Submitted 26 June, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

arXiv:2104.00798 [pdf, other]

FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

Authors: Haiyan Wang, Jiahao Pang, Muhammad A. Lodhi, Yingli Tian, Dong Tian

Abstract: Scene flow depicts the dynamics of a 3D scene, which is critical for various applications such as autonomous driving, robot navigation, AR/VR, etc. Conventionally, scene flow is estimated from dense/regular RGB video frames. With the development of depth-sensing technologies, precise 3D measurements are available via point clouds which have sparked new research in 3D scene flow. Nevertheless, it remains challenging to extract scene flow from point clouds due to the sparsity and irregularity in typical point cloud sampling patterns. One major issue related to irregular sampling is identified as the randomness during point set abstraction/feature extraction -- an elementary process in many flow estimation scenarios. A novel Spatial Abstraction with Attention (SA^2) layer is accordingly proposed to alleviate the unstable abstraction problem. Moreover, a Temporal Abstraction with Attention (TA^2) layer is proposed to rectify attention in temporal domain, leading to benefits with motions scaled in a larger range. Extensive analysis and experiments verified the motivation and significant performance gains of our method, dubbed as Flow Estimation via Spatial-Temporal Attention (FESTA), when compared to several state-of-the-art benchmarks of scene flow estimation.

Submitted 6 December, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

Comments: Accepted at CVPR 2021 (Oral Presentation)

arXiv:2103.10141 [pdf]

In Operando magnetometry study on the charge storage mechanism of SnCo alloy lithium ion batteries

Authors: Qingtao Xia, Xiangkun Li, Kai Wang, Zhaohui Li, Hengjun Liu, Xia Wang, Wanneng Ye, Hongsen Li, Han Hu, Jinbo Pang, Qinghua Zhang, Chen Ge, Shandong Li, Lin Gu, Guoxing Miao, Shishen Yan, Qiang Li

Abstract: In view of the long-standing controversy over the reversibility of transition metals in Sn-based alloys as anode for Li-ion batteries, an in situ real-time magnetic monitoring method was used to investigate the evolution of Sn-Co intermetallic during the electrochemical cycling. Sn-Co alloy film anodes with different compositions were prepared via magnetron sputtering without using binders and conductive additives. The magnetic responses showed that the Co particles liberated by Li insertion recombine fully with Sn during the delithiation to reform Sn-Co intermetallic into stannum richer phases Sn7Co3. However, as the Co content increases, it can only recombine partially with Sn into cobalt richer phases Sn3Co7. The unconverted Co particles may form a dense barrier layer and prevent the full reaction of Li with all the Sn in the anode, leading to lower capacities. These critical results shed light on understanding the reaction mechanism of transition metals, and provide valuable insights toward the design of high-performance Sn alloy based anodes.

Submitted 18 March, 2021; originally announced March 2021.

Comments: 21 pages, 15 figures

arXiv:2101.02267 [pdf]

doi 10.1016/j.eurpolymj.2020.110187

Hydrogen phosphate-mediated acellular biomineralisation within a dual crosslinked hyaluronic acid hydrogel

Authors: Ziyu Gao, Layla Hassouneh, Xuebin Yang, Juan Pang, Paul D. Thornton, Giuseppe Tronci

Abstract: The creation of hyaluronic acid (HA)-based materials as biomineralisation scaffolds for cost-effective hard tissue regenerative therapies remains a key biomedical challenge. A non-toxic and simple acellular method to generate specific hydrogen phosphate interactions within the polymer network of cystamine-crosslinked HA hydrogels is reported. Reinforced dual crosslinked hydrogel networks were accomplished after 4-week incubation in disodium phosphate-supplemented solutions that notably enabled the mineralisation of hydroxyapatite (HAp) crystals across the entire hydrogel structure. Hydrogen phosphate-cystamine crosslinked HA hydrogen bond interactions were confirmed by attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR) and density functional theory (DFT) calculations. Hydrogen phosphate-mediated physical crosslinks proved to serve as a first nucleation step for acellular hydrogel mineralisation in simulated body fluid allowing HAp crystals to be detected by X-ray powder diffraction (2θ = 27°, 33° and 35°) and visualised with density gradient across the entire hydrogel network. On a cellular level, the presence of aggregated structures proved key to inducing ATDC 5 cell migration whilst no toxic response was observed after 3-week culture. This mild and facile ion-mediated stabilisation of HA-based hydrogels has significant potential for accelerated hard tissue repair in vivo and provides a new perspective in the design of dual crosslinked mechanically competent hydrogels.

Submitted 6 January, 2021; originally announced January 2021.

arXiv:2101.02069 [pdf, other]

Model Extraction and Defenses on Generative Adversarial Networks

Authors: Hailong Hu, Jun Pang

Abstract: Model extraction attacks aim to duplicate a machine learning model through query access to a target model. Early studies mainly focus on discriminative models. Despite the success, model extraction attacks against generative models are less well explored. In this paper, we systematically study the feasibility of model extraction attacks against generative adversarial networks (GANs). Specifically, we first define accuracy and fidelity on model extraction attacks against GANs. Then we study model extraction attacks against GANs from the perspective of accuracy extraction and fidelity extraction, according to the adversary's goals and background knowledge. We further conduct a case study where an adversary can transfer knowledge of the extracted model which steals a state-of-the-art GAN trained with more than 3 million images to new domains to broaden the scope of applications of model extraction attacks. Finally, we propose effective defense techniques to safeguard GANs, considering a trade-off between the utility and security of GAN models.

Submitted 6 January, 2021; originally announced January 2021.

arXiv:2101.00644 [pdf, other]

Target Control of Asynchronous Boolean Networks

Authors: Cui Su, Jun Pang

Abstract: We study the target control of asynchronous Boolean networks, to identify efficacious interventions that can drive the dynamics of a given Boolean network from any initial state to the desired target attractor. Based on the application time, the control can be realised with three types of perturbations, including instantaneous, temporary and permanent perturbations. We develop efficient methods to compute the target control for a given target attractor with three types of perturbations. We compare our methods with the stable motif-based control on a variety of real-life biological networks to evaluate their performance. We show that our methods scale well for large Boolean networks and they are able to identify a rich set of solutions with a small number of perturbations.

Submitted 3 January, 2021; originally announced January 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2006.02304

arXiv:2012.14578 [pdf]

Real-time Whole-body Obstacle Avoidance for 7-DOF Redundant Manipulators

Authors: Dake Zheng, Xinyu Wu, Jianxin Pang

Abstract: Mainly because of the heavy computational costs, the real-time whole-body obstacle avoidance for the redundant manipulators has not been well implemented. This paper presents an approach that can ensure that the whole-body of a redundant manipulator can avoid moving obstacles in real-time during the execution of a task. The manipulator is divided into end-effector and non-end-effector portion. Based on dynamical systems (DS), the real-time end-effector obstacle avoidance is obtained. Besides, the end-effector can reach the given target. By using null-space velocity control, the real-time non-endeffector obstacle avoidance is achieved. Finally, a controller is designed to ensure the whole-body obstacle avoidance. We validate the effectiveness of the method in the simulations and experiments on the 7-DOF arm of the UBTECH humanoid robot.

Submitted 28 December, 2020; originally announced December 2020.

arXiv:2012.14576 [pdf]

Dynamical Systems based Obstacle Avoidance with Workspace Constraint for Manipulators

Authors: Dake Zheng, Xinyu Wu, Jianxin Pang

Abstract: In this paper, based on Dynamical Systems (DS), we present an obstacle avoidance method that take into account workspace constraint for serial manipulators. Two modulation matrices that consider the effect of an obstacle and the workspace of a manipulator are determined when the obstacle does not intersect the workspace boundary and when the obstacle intersects the workspace boundary respectively. Using the modulation matrices, an original DS is deformed. The proposed approach can ensure that the trajectory of the manipulator computed according to the deformed DS neither penetrate the obstacle nor go out of the workspace. We validate the effectiveness of the approach in the simulations and experiments on the left arm of the UBTECH humanoid robot.

Submitted 28 December, 2020; originally announced December 2020.

arXiv:2012.13977 [pdf, other]

Capacity-achieving Polar-based LDGM Codes

Authors: James Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan

Abstract: In this paper, we study codes with sparse generator matrices. More specifically, low-density generator matrix (LDGM) codes with a certain constraint on the weight of the columns in the generator matrix are considered. In this paper, it is first shown that when a BMS channel W and a constant s>0 are given, there exists a polarization kernel such that the corresponding polar code is capacity-achieving and the column weights of the generator matrix (GM) are bounded from above by $N^s$. Then, a general construction based on a concatenation of polar codes and a rate-$1$ code, and a new column-splitting algorithm that guarantees a much sparser GM, is given. More specifically, for any BMS channel and any $ε> 2ε^*$, where $ε^* \approx 0.085$, an existence of a sequence of capacity-achieving codes with all the GM column weights upper bounded by $(\log N)^{1+ε}$ is shown. Furthermore, two coding schemes for BEC and BMS channels, based on a second column-splitting algorithm, are devised with low-complexity decoding that uses successive-cancellation. The second splitting algorithm allows for the use of a low-complexity decoder by preserving the reliability of the bit-channels observed by the source bits, and by increasing the code block length. The concatenation-based construction can also be applied to the random linear code ensemble to yield capacity-achieving codes with all the GM column weights being $O(\log N)$ and with (large-degree) polynomial decoding complexity.

Submitted 27 June, 2022; v1 submitted 27 December, 2020; originally announced December 2020.

Comments: Extended version, now includes moderate-block length comparison with the RLE. arXiv admin note: text overlap with arXiv:2001.11986

arXiv:2012.07088 [pdf, other]

From #Jobsearch to #Mask: Improving COVID-19 Cascade Prediction with Spillover Effects

Authors: Ninghan Chen, Xihui Chen, Zhiqiang Zhong, Jun Pang

Abstract: An information outbreak occurs on social media along with the COVID-19 pandemic and leads to infodemic. Predicting the popularity of online content, known as cascade prediction, allows for not only catching in advance hot information that deserves attention, but also identifying false information that will widely spread and require quick response to mitigate its impact. Among the various information diffusion patterns leveraged in previous works, the spillover effect of the information exposed to users on their decision to participate in diffusing certain information is still not studied. In this paper, we focus on the diffusion of information related to COVID-19 preventive measures. Through our collected Twitter dataset, we validated the existence of this spillover effect. Building on the finding, we proposed extensions to three cascade prediction methods based on Graph Neural Networks (GNNs). Experiments conducted on our dataset demonstrated that the use of the identified spillover effect significantly improves the state-of-the-art GNNs methods in predicting the popularity of not only preventive measure messages, but also other COVID-19 related messages.

Submitted 12 August, 2021; v1 submitted 13 December, 2020; originally announced December 2020.

arXiv:2011.02887 [pdf, other]

doi 10.1007/s11192-021-03984-1

Semantic and Relational Spaces in Science of Science: Deep Learning Models for Article Vectorisation

Authors: Diego Kozlowski, Jennifer Dusdal, Jun Pang, Andreas Zilian

Abstract: Over the last century, we observe a steady and exponentially growth of scientific publications globally. The overwhelming amount of available literature makes a holistic analysis of the research within a field and between fields based on manual inspection impossible. Automatic techniques to support the process of literature review are required to find the epistemic and social patterns that are embedded in scientific publications. In computer sciences, new tools have been developed to deal with large volumes of data. In particular, deep learning techniques open the possibility of automated end-to-end models to project observations to a new, low-dimensional space where the most relevant information of each observation is highlighted. Using deep learning to build new representations of scientific publications is a growing but still emerging field of research. The aim of this paper is to discuss the potential and limits of deep learning for gathering insights about scientific research articles. We focus on document-level embeddings based on the semantic and relational aspects of articles, using Natural Language Processing (NLP) and Graph Neural Networks (GNNs). We explore the different outcomes generated by those techniques. Our results show that using NLP we can encode a semantic space of articles, while with GNN we are able to build a relational space where the social practices of a research community are also encoded.

Submitted 5 November, 2020; originally announced November 2020.

Journal ref: Scientometrics 126, 2021

arXiv:2011.02883 [pdf, ps, other]

Collaborative City Digital Twin For Covid-19 Pandemic: A Federated Learning Solution

Authors: Junjie Pang, Jianbo Li, Zhenzhen Xie, Yan Huang, Zhipeng Cai

Abstract: In this work, we propose a collaborative city digital twin based on FL, a novel paradigm that allowing multiple city DT to share the local strategy and status in a timely manner. In particular, an FL central server manages the local updates of multiple collaborators (city DT), provides a global model which is trained in multiple iterations at different city DT systems, until the model gains the correlations between various response plan and infection trend. That means, a collaborative city DT paradigm based on FL techniques can obtain knowledge and patterns from multiple DTs, and eventually establish a `global view' for city crisis management. Meanwhile, it also helps to improve each city digital twin selves by consolidating other DT's respective data without violating privacy rules. To validate the proposed solution, we take COVID-19 pandemic as a case study. The experimental results on the real dataset with various response plan validate our proposed solution and demonstrate the superior performance.

Submitted 5 November, 2020; originally announced November 2020.

Comments: 8 pages

MSC Class: none

arXiv:2010.13735 [pdf, other]

Personalised Meta-path Generation for Heterogeneous GNNs

Authors: Zhiqiang Zhong, Cheng-Te Li, Jun Pang

Abstract: Recently, increasing attention has been paid to heterogeneous graph representation learning (HGRL), which aims to embed rich structural and semantic information in heterogeneous information networks (HINs) into low-dimensional node representations. To date, most HGRL models rely on hand-crafted meta-paths. However, the dependency on manually-defined meta-paths requires domain knowledge, which is difficult to obtain for complex HINs. More importantly, the pre-defined or generated meta-paths of all existing HGRL methods attached to each node type or node pair cannot be personalised to each individual node. To fully unleash the power of HGRL, we present a novel framework, Personalised Meta-path based Heterogeneous Graph Neural Networks (PM-HGNN), to jointly generate meta-paths that are personalised for each individual node in a HIN and learn node representations for the target downstream task like node classification. Precisely, PM-HGNN treats the meta-path generation as a Markov Decision Process and utilises a policy network to adaptively generate a meta-path for each individual node and simultaneously learn effective node representations. The policy network is trained with deep reinforcement learning by exploiting the performance improvement on a downstream task. We further propose an extension, PM-HGNN++, to better encode relational structure and accelerate the training during the meta-path generation. Experimental results reveal that both PM-HGNN and PM-HGNN++ can significantly and consistently outperform 16 competing baselines and state-of-the-art methods in various settings of node classification. Qualitative analysis also shows that PM-HGNN++ can identify meaningful meta-paths overlooked by human knowledge.

Submitted 11 October, 2022; v1 submitted 26 October, 2020; originally announced October 2020.

arXiv:2010.00238 [pdf, other]

Multi-grained Semantics-aware Graph Neural Networks

Authors: Zhiqiang Zhong, Cheng-Te Li, Jun Pang

Abstract: Graph Neural Networks (GNNs) are powerful techniques in representation learning for graphs and have been increasingly deployed in a multitude of different applications that involve node- and graph-wise tasks. Most existing studies solve either the node-wise task or the graph-wise task independently while they are inherently correlated. This work proposes a unified model, AdamGNN, to interactively learn node and graph representations in a mutual-optimisation manner. Compared with existing GNN models and graph pooling methods, AdamGNN enhances the node representation with the learned multi-grained semantics and avoids losing node features and graph structure information during pooling. Specifically, a differentiable pooling operator is proposed to adaptively generate a multi-grained structure that involves meso- and macro-level semantic information in the graph. We also devise the unpooling operator and the flyback aggregator in AdamGNN to better leverage the multi-grained semantics to enhance node representations. The updated node representations can further adjust the graph representation in the next iteration. Experiments on 14 real-world graph datasets show that AdamGNN can significantly outperform 17 competing models on both node- and graph-wise tasks. The ablation studies confirm the effectiveness of AdamGNN's components, and the last empirical analysis further reveals the ingenious ability of AdamGNN in capturing long-range interactions.

Submitted 18 March, 2022; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2009.11555 [pdf]

doi 10.31635/ccschem.021.202101341

Antenna enhanced infrared photoinduced force imaging in aqueous environment with super-resolution and hypersensitivity

Authors: Jian Li, Jie Pang, Zhen-dong Yan, Junghoon Jahng, Jin Li, William Morrison, Jing Liang, Qing-Ying Zhang, Xing-Hua Xia

Abstract: Tip enhanced IR spectra and imaging have been widely used in cutting-edge studies for the in-depth understanding of the composition, structure and function of interfaces at the nanoscale. However, molecular monolayer sensitivity has only been demonstrated on solid/gas interfaces. In aqueous environment, the reduced sensitivity due to strong damping of the cantilever oscillation and background IR absorption extremely limits the practical applications of tip enhanced IR nanospectroscopy. Here, we demonstrate hypersensitive nanoscale IR spectra and imaging in aqueous environment with the combination of photoinduced force (PiF) microscopy and resonant antennas. The highly confined electromagnetic field inbetween the tip end and antenna extremely amplifies the photoinduced force to the detectable level, while the excitation via plasmon internal reflection mode minimizes the environmental absorption. A polydimethylsiloxane (PDMS) layer (~1-2 nm thickness) functionalized on the AFM tip has been successfully identified in water with antennas of different sizes. Sampling volume of ~604 chemical bonds from PDMS was demonstrated with sub-10 nm spatial resolution confirmed by electric (E) field distribution mapping on antennas, which strongly suggests the desired requirements for interfacial spectroscopy. This platform demonstrates for the first time the application of photoinduced force microscopy in aqueous environments, providing a brand-new configuration to achieve highly enhanced nanoscale IR signals, which is extremely promising for future research of interfaces and nanosystems in aqueous environments.

Submitted 29 August, 2021; v1 submitted 24 September, 2020; originally announced September 2020.

arXiv:2009.03717 [pdf, other]

Hierarchical Message-Passing Graph Neural Networks

Authors: Zhiqiang Zhong, Cheng-Te Li, Jun Pang

Abstract: Graph Neural Networks (GNNs) have become a prominent approach to machine learning with graphs and have been increasingly applied in a multitude of domains. Nevertheless, since most existing GNN models are based on flat message-passing mechanisms, two limitations need to be tackled: (i) they are costly in encoding long-range information spanning the graph structure; (ii) they are failing to encode features in the high-order neighbourhood in the graphs as they only perform information aggregation across the observed edges in the original graph. To deal with these two issues, we propose a novel Hierarchical Message-passing Graph Neural Networks framework. The key idea is generating a hierarchical structure that re-organises all nodes in a flat graph into multi-level super graphs, along with innovative intra- and inter-level propagation manners. The derived hierarchy creates shortcuts connecting far-away nodes so that informative long-range interactions can be efficiently accessed via message passing and incorporates meso- and macro-level semantics into the learned node representations. We present the first model to implement this framework, termed Hierarchical Community-aware Graph Neural Network (HC-GNN), with the assistance of a hierarchical community detection algorithm. The theoretical analysis illustrates HC-GNN's remarkable capacity in capturing long-range information without introducing heavy additional computation complexity. Empirical experiments conducted on 9 datasets under transductive, inductive, and few-shot settings exhibit that HC-GNN can outperform state-of-the-art GNN models in network analysis tasks, including node classification, link prediction, and community detection. Moreover, the model analysis further demonstrates HC-GNN's robustness facing graph sparsity and the flexibility in incorporating different GNN encoders.

Submitted 26 October, 2022; v1 submitted 8 September, 2020; originally announced September 2020.

arXiv:2008.13014 [pdf, other]

doi 10.1103/PhysRevD.102.114515

$DDK$ system in finite volume

Authors: Jin-Yi Pang, Jia-Jun Wu, Li-Sheng Geng

Abstract: The $DDK$ 3-body system is supposed to be bound due to the strongly attractive interaction between the $D$ meson and the $K$ meson in the isospin zero channel. The minimum quark content of this 3-body bound state is $cc\bar{q}\bar{s}$ with $q=u,d$. It will be an explicitly exotic tetraquark state once discovered. In order to confirm the phenomenological study of the $DDK$ system, we can refer to lattice QCD as a powerful theoretical tool parallel to the experiment measurement. In this paper, a 3-body quantization condition scheme is derived via the non-relativistic effective theory and the particle-dimer picture in finite volume. Lattice spectrum of this 3-body system is calculated within the existing model inputs. The spectrum shows various interesting properties of the $DDK$ system, and it may reveal the nature of the $D^*(2317)$. This predicated spectrum is expected to be tested in future lattice simulations.

Submitted 29 August, 2020; originally announced August 2020.

Comments: 21 pages, 8 figures

Journal ref: Phys. Rev. D 102, 114515 (2020)

arXiv:2008.10032 [pdf, other]

Seesaw Loss for Long-Tailed Instance Segmentation

Authors: Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin

Abstract: Instance segmentation has witnessed a remarkable progress on class-balanced benchmarks. However, they fail to perform as accurately in real-world scenarios, where the category distribution of objects naturally comes with a long tail. Instances of head classes dominate a long-tailed dataset and they serve as negative samples of tail categories. The overwhelming gradients of negative samples on tail classes lead to a biased learning process for classifiers. Consequently, objects of tail categories are more likely to be misclassified as backgrounds or head categories. To tackle this problem, we propose Seesaw Loss to dynamically re-balance gradients of positive and negative samples for each category, with two complementary factors, i.e., mitigation factor and compensation factor. The mitigation factor reduces punishments to tail categories w.r.t. the ratio of cumulative training instances between different categories. Meanwhile, the compensation factor increases the penalty of misclassified instances to avoid false positives of tail categories. We conduct extensive experiments on Seesaw Loss with mainstream frameworks and different data sampling strategies. With a simple end-to-end training pipeline, Seesaw Loss obtains significant gains over Cross-Entropy Loss, and achieves state-of-the-art performance on LVIS dataset without bells and whistles. Code is available at https://github.com/open-mmlab/mmdetection.

Submitted 17 June, 2021; v1 submitted 23 August, 2020; originally announced August 2020.

Comments: CVPR 2021 Camera Ready

arXiv:2008.07746 [pdf]

Glass shaping at nanoscale: Mechanical forming of brittle amorphous silica by engineered inelastic interaction of scanning electrons with matter

Authors: Sung-gyu Kang, Kyeongjae Jeong, Woo Jin Cho, Jeongin Paeng, Jae-Pyeong Ahn, Steven Boles, Heung Nam Han, In-Suk Choi

Abstract: Amorphous silica deforms viscoplastically at elevated temperatures, as is common for brittle glasses. The key mechanism of viscoplastic deformation involves interatomic bond switching, which is known to be a thermally activated process. In this study, through systematic in-situ compression tests by scanning electron microscopy, the viscoplastic deformation of amorphous silica is observed without thermal activation. Furthermore, ductility does not increase monotonically with acceleration voltage and current density of the SEM e-beam but is maximized by a factor of three at a specific acceleration voltage and current density conditions (compared to beam-off conditions). A Monte Carlo simulation of the electron-matter interaction shows that the unique trends of viscoplastic deformation correlate with the interaction volume, i.e., the region within the material where inelastic electron scattering occurs. Changing the size of the migrating atomic clusters can lead to facility in rearrangements of the intramolecular bonds, hence leading to more sustained bond switching. Based on the interaction volume the mechanical shaping of small-scale amorphous silica structures under e-beam irradiation can be modeled with high-precision supporting the idea that this relatively low-voltage e-beam-irradiation induced viscoplastic-deformation technique holds great potential for advancing amorphous silica structure manufacturing and developing e-beam assisted manufacturing for covalently bonded non-metallic materials.

Submitted 18 August, 2020; originally announced August 2020.

Comments: 29 pages, 5 Figures

Showing 101–150 of 296 results for author: Pang, J