-
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation
Authors:
Junkai Zhou,
Liang Pang,
Huawei Shen,
Xueqi Cheng
Abstract:
Language models trained on large-scale corpora can generate remarkably fluent results in open-domain dialogue. However, for the persona-based dialogue generation task, consistency and coherence are also key factors, and both remain great challenges for language models. Existing works mainly focus on filtering valuable data, modifying model structure, or designing objective functions, but their improvements are limited and hard to generalize to all types of pre-trained language models. However, we find that language models can produce consistent and coherent responses if we consider enough generations. Thus, the problems lie in large-scale response generation and target response selection. In this work, a simple but effective two-stage SimOAP strategy is proposed, i.e., over-sampling and post-evaluation. The over-sampling stage efficiently draws large-scale responses from existing trained models via off-the-shelf distilling and compressing methods, and the post-evaluation stage selects a good response from the large-scale candidates based on multiple well-designed evaluation metrics. Experimental results show that the proposed plug-in SimOAP strategy improves the backbone models and outperforms the baseline strategies in both automatic and human evaluations.
Submitted 20 May, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
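The two stages named in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy generator, the word-overlap scorers, and the function names `over_sample` and `post_evaluate` are hypothetical and are not taken from the paper, which uses trained dialogue models and its own evaluation metrics.

```python
import random
from typing import Callable, List, Sequence

Scorer = Callable[[str], float]


def over_sample(generate: Callable[[str, random.Random], str],
                context: str, n: int, seed: int = 0) -> List[str]:
    """Stage 1 (over-sampling): draw n candidate responses from a generator."""
    rng = random.Random(seed)
    return [generate(context, rng) for _ in range(n)]


def total_score(response: str, scorers: Sequence[Scorer]) -> float:
    """Sum the scores that each metric assigns to one response."""
    return sum(score(response) for score in scorers)


def post_evaluate(candidates: Sequence[str], scorers: Sequence[Scorer]) -> str:
    """Stage 2 (post-evaluation): keep the candidate with the highest total score."""
    return max(candidates, key=lambda r: total_score(r, scorers))


def overlap(a: str, b: str) -> float:
    """Toy metric (assumption): fraction of words in a that also occur in b."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa), 1)


# Hypothetical stand-in for a trained dialogue model.
CANNED = ["I love hiking with my dog.",
          "The weather is nice.",
          "I hate the outdoors."]


def toy_generate(context: str, rng: random.Random) -> str:
    return rng.choice(CANNED)


persona = "I love hiking with my dog"
context = "What do you do on weekends with your dog?"
scorers = [lambda r: overlap(r, context),   # stand-in for a coherence metric
           lambda r: overlap(r, persona)]   # stand-in for a consistency metric

candidates = over_sample(toy_generate, context, n=50)
best = post_evaluate(candidates, scorers)
```

The design point is only that generation and selection are decoupled: any generator can be plugged into the first stage, and any set of metrics into the second.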
-
BERM: Training the Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval
Authors:
Shicheng Xu,
Liang Pang,
Huawei Shen,
Xueqi Cheng
Abstract:
Dense retrieval has shown promise in the first-stage retrieval process when trained on in-domain labeled datasets. However, previous studies have found that dense retrieval is hard to generalize to unseen domains due to its weak modeling of domain-invariant and interpretable feature (i.e., matching signal between two texts, which is the essence of information retrieval). In this paper, we propose a novel method to improve the generalization of dense retrieval via capturing matching signal called BERM. Fully fine-grained expression and query-oriented saliency are two properties of the matching signal. Thus, in BERM, a single passage is segmented into multiple units and two unit-level requirements are proposed for representation as the constraint in training to obtain the effective matching signal. One is semantic unit balance and the other is essential matching unit extractability. Unit-level view and balanced semantics make representation express the text in a fine-grained manner. Essential matching unit extractability makes passage representation sensitive to the given query to extract the pure matching information from the passage containing complex context. Experiments on BEIR show that our method can be effectively combined with different dense retrieval training methods (vanilla, hard negatives mining and knowledge distillation) to improve its generalization ability without any additional inference overhead and target domain data.
Submitted 18 May, 2023;
originally announced May 2023.
-
Unsupervised Hyperspectral Pansharpening via Low-rank Diffusion Model
Authors:
Xiangyu Rui,
Xiangyong Cao,
Li Pang,
Zeyu Zhu,
Zongsheng Yue,
Deyu Meng
Abstract:
Hyperspectral pansharpening is a process of merging a high-resolution panchromatic (PAN) image and a low-resolution hyperspectral (LRHS) image to create a single high-resolution hyperspectral (HRHS) image. Existing Bayesian-based HS pansharpening methods require designing handcraft image prior to characterize the image features, and deep learning-based HS pansharpening methods usually require a large number of paired training data and suffer from poor generalization ability. To address these issues, in this work, we propose a low-rank diffusion model for hyperspectral pansharpening by simultaneously leveraging the power of the pre-trained deep diffusion model and better generalization ability of Bayesian methods. Specifically, we assume that the HRHS image can be recovered from the product of two low-rank tensors, i.e., the base tensor and the coefficient matrix. The base tensor lies on the image field and has a low spectral dimension. Thus, we can conveniently utilize a pre-trained remote sensing diffusion model to capture its image structures. Additionally, we derive a simple yet quite effective way to pre-estimate the coefficient matrix from the observed LRHS image, which preserves the spectral information of the HRHS. Experimental results demonstrate that the proposed method performs better than some popular traditional approaches and gains better generalization ability than some DL-based methods. The code is released in https://github.com/xyrui/PLRDiff.
Submitted 19 November, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Visual Transformation Telling
Authors:
Wanqing Cui,
Xin Hong,
Yanyan Lan,
Liang Pang,
Jiafeng Guo,
Xueqi Cheng
Abstract:
Humans can naturally reason from superficial state differences (e.g. ground wetness) to transformation descriptions (e.g. raining) according to their life experience. In this paper, we propose a new visual reasoning task to test this transformation reasoning ability in real-world scenarios, called Visual Transformation Telling (VTT). Given a series of states (i.e. images), VTT requires describing the transformation occurring between every two adjacent states. Different from existing visual reasoning tasks that focus on surface state reasoning, the advantage of VTT is that it captures the underlying causes, e.g. actions or events, behind the differences among states. We collect a novel dataset to support the study of transformation reasoning from two existing instructional video datasets, CrossTask and COIN, comprising 13,547 samples. Each sample involves the key state images along with their transformation descriptions. Our dataset covers diverse real-world activities, providing a rich resource for training and evaluation. To construct an initial benchmark for VTT, we test several models, including traditional visual storytelling methods (CST, GLACNet, Densecap) and advanced multimodal large language models (LLaVA v1.5-7B, Qwen-VL-chat, Gemini Pro Vision, GPT-4o, and GPT-4). Experimental results reveal that even state-of-the-art models still face challenges in VTT, highlighting substantial areas for improvement.
Submitted 11 June, 2024; v1 submitted 3 May, 2023;
originally announced May 2023.
-
Visual Reasoning: from State to Transformation
Authors:
Xin Hong,
Yanyan Lan,
Liang Pang,
Jiafeng Guo,
Xueqi Cheng
Abstract:
Most existing visual reasoning tasks, such as CLEVR in VQA, ignore an important factor, i.e., transformation. They are solely defined to test how well machines understand concepts and relations within static settings, like one image. Such state-driven visual reasoning has limitations in reflecting the ability to infer the dynamics between different states, which has been shown to be equally important for human cognition in Piaget's theory. To tackle this problem, we propose a novel transformation-driven visual reasoning (TVR) task. Given both the initial and final states, the target is to infer the corresponding intermediate transformation. Following this definition, a new synthetic dataset named TRANCE is first constructed on the basis of CLEVR, including three levels of settings, i.e., Basic (single-step transformation), Event (multi-step transformation), and View (multi-step transformation with variant views). Next, we build another real dataset called TRANCO based on COIN, to cover the loss of transformation diversity on TRANCE. Inspired by human reasoning, we propose a three-staged reasoning framework called TranNet, including observing, analyzing, and concluding, to test how recent advanced techniques perform on TVR. Experimental results show that the state-of-the-art visual reasoning models perform well on Basic, but are still far from human-level intelligence on Event, View, and TRANCO. We believe the proposed new paradigm will boost the development of machine visual reasoning. More advanced methods and new problems need to be investigated in this direction. The resource of TVR is available at https://hongxin2019.github.io/TVR/.
Submitted 2 May, 2023;
originally announced May 2023.
-
Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
Authors:
Shicheng Xu,
Liang Pang,
Huawei Shen,
Xueqi Cheng,
Tat-Seng Chua
Abstract:
Making the content generated by Large Language Models (LLMs) accurate, credible and traceable is crucial, especially in complex knowledge-intensive tasks that require multi-step reasoning where each step needs knowledge to solve. Retrieval-augmented generation has good potential to solve this problem. However, where and how to introduce Information Retrieval (IR) to the LLM is a big challenge. Previous work has the problems that wrong knowledge retrieved by IR misleads the LLM and that interaction between IR and the LLM breaks the reasoning chain of the LLM. This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between LLM and IR to solve these challenges. First, the LLM generates the reasoning chain named Chain-of-Query (CoQ), where each node consists of an IR-oriented query-answer pair. Second, IR verifies the answer of each node of CoQ. It corrects the answer that is not consistent with the retrieved information when IR gives high confidence, which improves the credibility. Third, the LLM can indicate its missing knowledge in CoQ and rely on IR to provide this knowledge to the LLM. These operations improve the accuracy in terms of reasoning and knowledge. Finally, SearChain generates the reasoning process and marks references to supporting documents for each reasoning step, which improves traceability. Interaction with IR in SearChain forms a novel reasoning path based on a tree, which enables the LLM to dynamically modify the direction of reasoning. Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks including multi-hop Q&A, slot filling, fact checking, and long-form Q&A.
Submitted 24 February, 2024; v1 submitted 28 April, 2023;
originally announced April 2023.
-
Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder
Authors:
Tao Sun,
Lu Pang,
Chao Chen,
Haibin Ling
Abstract:
Deep neural networks are vulnerable to backdoor attacks, where an adversary maliciously manipulates the model behavior through overlaying images with special triggers. Existing backdoor defense methods often require accessing a few validation data and model parameters, which is impractical in many real-world applications, e.g., when the model is provided as a cloud service. In this paper, we address the practical task of blind backdoor defense at test time, in particular for black-box models. The true label of every test image needs to be recovered on the fly from a suspicious model regardless of image benignity. We focus on test-time image purification methods that incapacitate possible triggers while keeping semantic contents intact. Due to diverse trigger patterns and sizes, the heuristic trigger search in image space can be unscalable. We circumvent this barrier by leveraging the strong reconstruction power of generative models, and propose a framework of Blind Defense with Masked AutoEncoder (BDMAE). It detects possible triggers in the token space using image structural similarity and label consistency between the test image and MAE restorations. The detection results are then refined by considering trigger topology. Finally, we fuse MAE restorations adaptively into a purified image for making predictions. Our approach is blind to the model architectures, trigger patterns and image benignity. Extensive experiments under different backdoor settings validate its effectiveness and generalizability. Code is available at https://github.com/tsun/BDMAE.
Submitted 2 October, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Exploring QCD matter in extreme conditions with Machine Learning
Authors:
Kai Zhou,
Lingxiao Wang,
Long-Gang Pang,
Shuzhe Shi
Abstract:
In recent years, machine learning has emerged as a powerful computational tool and novel problem-solving perspective for physics, offering new avenues for studying strongly interacting QCD matter properties under extreme conditions. This review article aims to provide an overview of the current state of this intersection of fields, focusing on the application of machine learning to theoretical studies in high energy nuclear physics. It covers diverse aspects, including heavy ion collisions, lattice field theory, and neutron stars, and discusses how machine learning can be used to explore and facilitate the physics goals of understanding QCD matter. The review also provides a commonality overview from a methodology perspective, from data-driven perspective to physics-driven perspective. We conclude by discussing the challenges and future prospects of machine learning applications in high energy nuclear physics, also underscoring the importance of incorporating physics priors into the purely data-driven learning toolbox. This review highlights the critical role of machine learning as a valuable computational paradigm for advancing physics exploration in high energy nuclear physics.
Submitted 1 December, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
High energy nuclear physics meets Machine Learning
Authors:
Wan-Bing He,
Yu-Gang Ma,
Long-Gang Pang,
Huichao Song,
Kai Zhou
Abstract:
Though being seemingly disparate and with relatively new intersection, high energy nuclear physics and machine learning have already begun to merge and yield interesting results during the last few years. It's worthy to raise the profile of utilizing this novel mindset from machine learning in high energy nuclear physics, to help more interested readers see the breadth of activities around this intersection. The aim of this mini-review is to introduce to the community the current status and report an overview of applying machine learning for high energy nuclear physics, to present from different aspects and examples how scientific questions involved in high energy nuclear physics can be tackled using machine learning.
Submitted 12 March, 2023;
originally announced March 2023.
-
Solving Schrodinger equations using physically constrained neural network
Authors:
Kai-Fang Pu,
Hanlin Li,
Hong-Liang Lu,
Long-Gang Pang
Abstract:
Deep neural network (DNN) and auto differentiation have been widely used in computational physics to solve variational problems. When DNN is used to represent the wave function to solve quantum many-body problems using variational optimization, various physical constraints have to be injected into the neural network by construction, to increase the data and learning efficiency. We build the unitary constraint to the variational wave function using a monotonic neural network to represent the Cumulative Distribution Function (CDF) $F(x) = \int_{-\infty}^{x} ψ^*ψdx'$. Using this constrained neural network to represent the variational wave function, we solve Schrodinger equations using auto-differentiation and stochastic gradient descent (SGD), by minimizing the violation of the trial wave function $ψ(x)$ to the Schrodinger equation. For several classical problems in quantum mechanics, we obtain their ground state wave function and energy with very low errors. The method developed in the present paper may pave a new way in solving nuclear many body problems in the future.
Submitted 7 March, 2023;
originally announced March 2023.
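The CDF construction in the abstract above can be unpacked in one line. This is a sketch of why a monotonic $F$ enforces normalization, not the paper's derivation; the last relation additionally assumes a real, non-negative trial wave function, which the abstract does not state.

```latex
F(x) = \int_{-\infty}^{x} \psi^{*}(x')\,\psi(x')\,dx'
\;\Longrightarrow\;
\frac{dF}{dx} = |\psi(x)|^{2} \ge 0 ,
\qquad
F(-\infty) = 0,\; F(+\infty) = 1
\;\Longrightarrow\;
\int_{-\infty}^{\infty} |\psi(x)|^{2}\,dx = 1 .

% Assumption (not in the abstract): for a real, non-negative trial function,
\psi(x) = \sqrt{\frac{dF}{dx}} .
```

A network that is monotonic in $x$ and pinned to 0 and 1 at the boundaries therefore yields a normalized $|\psi|^{2}$ by construction, with no separate normalization penalty.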
-
Multi-video Moment Ranking with Multimodal Clue
Authors:
Danyang Hou,
Liang Pang,
Yanyan Lan,
Huawei Shen,
Xueqi Cheng
Abstract:
Video corpus moment retrieval (VCMR) is the task of retrieving a relevant video moment from a large corpus of untrimmed videos via a natural language query. State-of-the-art work for VCMR is based on a two-stage method. In this paper, we focus on improving two problems of the two-stage method: (1) Moment prediction bias: The predicted moments for most queries come from the top retrieved videos, ignoring the possibility that the target moment is in the bottom retrieved videos, which is caused by the inconsistency of Shared Normalization during training and inference. (2) Latent key content: Different modalities of video have different key information for moment localization. To this end, we propose a two-stage model, MultI-video raNking with mUlTimodal cluE (MINUTE). MINUTE uses Shared Normalization during both training and inference to rank candidate moments from multiple videos to solve moment prediction bias, making it more efficient to predict the target moment. In addition, Multimodal Clue Mining (MCM) of MINUTE can discover key content of different modalities in video to localize moments more accurately. MINUTE outperforms the baselines on TVR and DiDeMo datasets, achieving a new state-of-the-art of VCMR. Our code will be available at GitHub.
Submitted 29 January, 2023;
originally announced January 2023.
-
Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding
Authors:
Yunchang Zhu,
Liang Pang,
Kangxi Wu,
Yanyan Lan,
Huawei Shen,
Xueqi Cheng
Abstract:
Current natural language understanding (NLU) models have been continuously scaling up, both in terms of model size and input context, introducing more hidden and input neurons. While this generally improves performance on average, the extra neurons do not yield a consistent improvement for all instances. This is because some hidden neurons are redundant, and the noise mixed in input neurons tends to distract the model. Previous work mainly focuses on extrinsically reducing low-utility neurons by additional post- or pre-processing, such as network pruning and context selection, to avoid this problem. Beyond that, can we make the model reduce redundant parameters and suppress input noise by intrinsically enhancing the utility of each neuron? If a model can efficiently utilize neurons, no matter which neurons are ablated (disabled), the ablated submodel should perform no better than the original full model. Based on such a comparison principle between models, we propose a cross-model comparative loss for a broad range of tasks. Comparative loss is essentially a ranking loss on top of the task-specific losses of the full and ablated models, with the expectation that the task-specific loss of the full model is minimal. We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks based on 5 widely used pretrained language models and find it particularly superior for models with few parameters or long input.
Submitted 9 March, 2024; v1 submitted 9 January, 2023;
originally announced January 2023.
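The comparison principle in the abstract above (an ablated submodel should do no better than the full model) can be sketched as a hinge-style ranking penalty. The pairwise hinge form, the `margin` parameter, and the function name `comparative_loss` are assumptions for illustration; the abstract only says the loss is a ranking loss over the task-specific losses of the full and ablated models.

```python
from typing import Sequence


def comparative_loss(full_loss: float,
                     ablated_losses: Sequence[float],
                     margin: float = 0.0) -> float:
    """Task loss of the full model plus a penalty for every ablated model
    whose task loss is lower than the full model's (hypothetical hinge form)."""
    penalty = sum(max(0.0, full_loss - ablated + margin)
                  for ablated in ablated_losses)
    return full_loss + penalty


# The full model already has the lowest loss: no ranking penalty is added.
no_penalty = comparative_loss(0.5, [0.7, 0.9])

# One ablated model (loss 0.5) beats the full model (loss 0.8): penalized.
with_penalty = comparative_loss(0.8, [0.5, 0.9])
```

In a training loop the three numbers would be differentiable loss tensors rather than floats; plain floats are used here only to keep the sketch self-contained.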
-
NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework
Authors:
Shicheng Xu,
Liang Pang,
Huawei Shen,
Xueqi Cheng
Abstract:
Information retrieval aims to find information that meets users' needs from the corpus. Different needs correspond to different IR tasks such as document retrieval, open-domain question answering, retrieval-based dialogue, etc., while they share the same schema to estimate the relationship between texts. This indicates that a good IR model can generalize to different tasks and domains. However, previous studies indicate that state-of-the-art neural information retrieval (NIR) models, e.g., pre-trained language models (PLMs), are hard to generalize, mainly because the end-to-end fine-tuning paradigm makes the model overemphasize task-specific signals and domain biases but lose the ability to capture generalized essential signals. To address this problem, we propose a novel NIR training framework named NIR-Prompt for retrieval and reranking stages based on the idea of decoupling signal capturing and combination. NIR-Prompt exploits an Essential Matching Module (EMM) to capture the essential matching signals and gets the description of tasks by a Matching Description Module (MDM). The description is used as task-adaptation information to combine the essential matching signals to adapt to different tasks. Experiments under in-domain multi-task, out-of-domain multi-task, and new task adaptation settings show that NIR-Prompt can improve the generalization of PLMs in NIR for both retrieval and reranking stages compared with baselines.
Submitted 18 December, 2023; v1 submitted 30 November, 2022;
originally announced December 2022.
-
A Remote Baby Surveillance System with RFID and GPS Tracking
Authors:
Ruven A/L Sundarajoo,
Gwo Chin Chung,
Wai Leong Pang,
Soo Fun Tan
Abstract:
In the 21st century, sending babies or children to daycare centres has become more and more common among young guardians. The balance between full-time work and child care is increasingly challenging nowadays. In Malaysia, thousands of child abuse cases have been reported from babysitting centres every year, which indeed triggers the anxiety and stress of the guardians. Hence, this paper proposes to construct a remote baby surveillance system with radio-frequency identification (RFID) and global positioning system (GPS) tracking. With the incorporation of the Internet of Things (IoT), a sensor-based microcontroller is used to detect the conditions of the baby as well as the surrounding environment and then display the real-time data as well as notifications to alert the guardians via a mobile application. These conditions include the crying and waking of the baby, as well as temperature, the mattress's wetness, and moving objects around the baby. In addition, RFID and GPS location tracking are implemented to ensure the safety of the baby, while white noise is used to increase the comfort of the baby. In the end, a prototype has been successfully developed for functionality and reliability testing. Several experiments have been conducted to measure the efficiency of the mattress's wetness detection, the RFID transmission range, the frequency spectrum of white noise, and also the output power of the solar panel. The proposed system is expected to assist guardians in ensuring the safety and comfort of their babies remotely, as well as prevent any occurrence of child abuse.
Submitted 26 November, 2022;
originally announced November 2022.
-
Backdoor Cleansing with Unlabeled Data
Authors:
Lu Pang,
Tao Sun,
Haibin Ling,
Chao Chen
Abstract:
Due to the increasing computational demand of Deep Neural Networks (DNNs), companies and organizations have begun to outsource the training process. However, the externally trained DNNs can potentially be backdoor attacked. It is crucial to defend against such attacks, i.e., to postprocess a suspicious model so that its backdoor behavior is mitigated while its normal prediction power on clean inputs remains uncompromised. To remove the abnormal backdoor behavior, existing methods mostly rely on additional labeled clean samples. However, such a requirement may be unrealistic as the training data are often unavailable to end users. In this paper, we investigate the possibility of circumventing this barrier. We propose a novel defense method that does not require training labels. Through a carefully designed layer-wise weight re-initialization and knowledge distillation, our method can effectively cleanse backdoor behaviors of a suspicious network with negligible compromise in its normal behavior. In experiments, we show that our method, trained without labels, is on par with state-of-the-art defense methods trained using labels. We also observe promising defense results even on out-of-distribution data. This makes our method very practical. Code is available at: https://github.com/luluppang/BCU.
Submitted 29 June, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Deep-learning quasi-particle masses from QCD equation of state
Authors:
Fu-Peng Li,
Hong-Liang Lü,
Long-Gang Pang,
Guang-You Qin
Abstract:
The interactions of quarks and gluons are strong in the non-perturbative region. The equation of state (EoS) of a strongly-interacting quantum chromodynamics (QCD) medium can only be studied using first-principle lattice QCD calculations. However, the complicated QCD EoS can be reproduced using a simple statistical formula by treating the medium as a free parton gas whose fundamental degrees of freedom are dressed quarks and gluons called quasi-particles, with temperature-dependent masses. We use a deep neural network and auto differentiation to solve this variational problem, in which the masses of quasi gluons, up/down and strange quarks are three unknown functions whose forms are represented by the deep neural network. We reproduce the QCD EoS using these machine-learned quasi-particle masses, and calculate the shear viscosity over entropy density ($η/s$) as a function of temperature of the hot QCD matter.
Submitted 15 November, 2022;
originally announced November 2022.
-
Composite Fixed-Length Ordered Features for Palmprint Template Protection with Diminished Performance Loss
Authors:
Weiqiang Zhao,
Heng Zhao,
Zhicheng Cao,
Liaojun Pang
Abstract:
Palmprint recognition has become more and more popular due to its advantages over other biometric modalities such as fingerprint, in that it is larger in area, richer in information and able to work at a distance. However, the issue of palmprint privacy and security (especially palmprint template protection) remains under-studied. Among the very few research works, most of them only use the directional and orientation features of the palmprint with transformation processing, yielding unsatisfactory protection and identification performance. Thus, this paper proposes a palmprint template protection-oriented operator that has a fixed length and is ordered in nature, by fusing point features and orientation features. Firstly, double orientations are extracted with more accuracy based on MFRAT. Then key points of SURF are extracted and converted to be fixed-length and ordered features. Finally, composite features that fuse up the double orientations and SURF points are transformed using the irreversible transformation of IOM to generate the revocable palmprint template. Experiments show that the EER after irreversible transformation on the PolyU and CASIA databases are 0.17% and 0.19% respectively, and the absolute precision loss is 0.08% and 0.07%, respectively, which proves the advantage of our method.
Submitted 9 November, 2022;
originally announced November 2022.
-
Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows
Authors:
Anyi Rao,
Xuekun Jiang,
Sichen Wang,
Yuwei Guo,
Zihao Liu,
Bo Dai,
Long Pang,
Xiaoyu Wu,
Dahua Lin,
Libiao Jin
Abstract:
The ability to choose an appropriate camera view among multiple cameras plays a vital role in TV show delivery. However, it is hard to figure out the statistical pattern and apply intelligent processing due to the lack of high-quality training data. To solve this issue, we first collect a novel benchmark on this setting with four diverse scenarios including concerts, sports games, gala shows, and contests, where each scenario contains 6 synchronized tracks recorded by different cameras. It contains 88 hours of raw video that contribute to 14 hours of edited video. Based on this benchmark, we further propose a new approach, the temporal and contextual transformer, which utilizes clues from historical shots and other views to make shot transition decisions and predict which view to use. Extensive experiments show that our method outperforms existing methods on the proposed multi-camera editing benchmark.
Submitted 17 October, 2022;
originally announced October 2022.
-
Baryonic spin Hall effects in Au+Au collisions at $\sqrt{s_{NN}} = 7.7-200$ GeV
Authors:
Baochi Fu,
Longgang Pang,
Huichao Song,
Yi Yin
Abstract:
In these proceedings, we present our recent prediction on the local net Lambda polarization to search for the baryonic spin Hall effect (SHE) at RHIC BES energies. The baryonic SHE is induced by the gradients of baryon chemical potential, which lead to local polarization separation between baryons and anti-baryons. Based on hydrodynamic simulations with the spin Cooper-Frye formula, we propose to use $P^{\rm net}_{2,y}$ and $P^{\rm net}_{2,z}$, the second Fourier coefficients of net spin polarization, to quantify this baryonic SHE. Future experimental observation of their non-trivial signatures could strongly support the existence of the baryonic SHE in hot and dense QCD matter.
Submitted 31 July, 2022;
originally announced August 2022.
-
A Data-driven Latent Semantic Analysis for Automatic Text Summarization using LDA Topic Modelling
Authors:
Daniel F. O. Onah,
Elaine L. L. Pang,
Mahmoud El-Haj
Abstract:
With the advent and popularity of big data mining and large-scale text analysis, automated text summarization has become prominent for extracting and retrieving important information from documents. This research investigates aspects of automatic text summarization from the perspectives of single and multiple documents. Summarization is the task of condensing long text articles into short, summarized versions. The text is reduced in size while preserving key information and retaining the meaning of the original document. This study presents the Latent Dirichlet Allocation (LDA) approach used to perform topic modelling on summarised medical science journal articles with topics related to genes and diseases. In this study, the PyLDAvis web-based interactive visualization tool was used to visualise the selected topics. The visualisation provides an overarching view of the main topics while allowing and attributing deep meaning to the prevalence of individual topics. This study presents a novel approach to summarization of single and multiple documents. In the results, the terms are ranked purely by their probability of topic prevalence within the processed document, using an extractive summarization technique. The PyLDAvis visualization offers flexibility in exploring how the terms of the topics are associated with the fitted LDA model. The topic modelling result shows prevalence within topics 1 and 2. This association reveals that there is similarity between the terms in topics 1 and 2 in this study. The efficacy of the LDA and the extractive summarization methods was measured using Latent Semantic Analysis (LSA) and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics to evaluate the reliability and validity of the model.
Submitted 29 May, 2023; v1 submitted 23 July, 2022;
originally announced July 2022.
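The ROUGE metric named in the abstract above can be illustrated with a minimal sketch of ROUGE-N recall. It assumes whitespace tokenization and lowercasing, which are simplifications; the abstract does not say which ROUGE variant or tooling the authors used, and the function name `rouge_n_recall` is hypothetical.

```python
from collections import Counter


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Counts are clipped, so a repeated n-gram in the reference is only
    matched as many times as it appears in the candidate summary.
    """
    def ngram_counts(text):
        tokens = text.lower().split()  # assumed tokenizer: whitespace
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngram_counts(candidate)
    ref = ngram_counts(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Candidate summary versus a reference summary.
print(rouge_n_recall("the cat sat", "the cat sat on the mat"))  # 3 of 6 unigrams matched
```

Recall rewards summaries that cover the reference content, which is why it is a common choice for judging extractive summaries.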
-
MLMSA: Multi-Label Multi-Side-Channel-Information enabled Deep Learning Attacks on APUF Variants
Authors:
Yansong Gao,
Jianrong Yao,
Lihui Pang,
Wei Yang,
Anmin Fu,
Said F. Al-Sarawi,
Derek Abbott
Abstract:
To improve the modeling resilience of silicon strong physical unclonable functions (PUFs), in particular the APUFs, which yield a very large number of challenge-response pairs (CRPs), a number of composite APUF variants such as XOR-APUF, interpose-PUF (iPUF), feed-forward APUF (FF-APUF), and OAX-APUF have been devised. When examining their security in terms of modeling resilience, the use of multiple information sources, such as power side-channel information (SCI) and/or reliability SCI given a challenge, is under-explored, which poses a challenge to their supposed modeling resilience in practice. Building upon a multi-label/head deep learning model architecture, this work proposes Multi-Label Multi-Side-channel-information enabled deep learning Attacks (MLMSA) to thoroughly evaluate the modeling resilience of the aforementioned APUF variants. Despite its simplicity, MLMSA can successfully break large-scale APUF variants, which has not previously been achieved. More precisely, MLMSA breaks the 128-stage 30-XOR-APUF, (9, 9)- and (2, 18)-iPUFs, and (2, 2, 30)-OAX-APUF when CRPs, power SCI and reliability SCI are concurrently used. It breaks the 128-stage 12-XOR-APUF and (2, 2, 9)-OAX-APUF even when only the easy-to-obtain reliability SCI and CRPs are exploited. The 128-stage six-loop FF-APUF and one-loop 20-XOR-FF-APUF can be broken by simultaneously using reliability SCI and CRPs. All these attacks are normally completed within an hour on a standard personal computer. Therefore, MLMSA is a useful technique for evaluating other existing or emerging strong PUF designs.
Submitted 10 January, 2023; v1 submitted 20 July, 2022;
originally announced July 2022.
-
Deep learning assisted jet tomography for the study of Mach cones in QGP
Authors:
Zhong Yang,
Yayun He,
Wei Chen,
Wei-Yao Ke,
Long-Gang Pang,
Xin-Nian Wang
Abstract:
Mach cones are expected to form in the expanding quark-gluon plasma (QGP) when energetic quarks and gluons (called jets) traverse the hot medium at a velocity faster than the speed of sound in high-energy heavy-ion collisions. The shape of the Mach cone and the associated diffusion wake are sensitive to the initial jet production location and the jet propagation direction relative to the radial flow because of the distortion by the collective expansion of the QGP and large density gradient. The shape of jet-induced Mach cones and their distortions in heavy-ion collisions provide a unique and direct probe of the dynamical evolution and the equation of state of QGP. However, it is difficult to identify the Mach cone and the diffusion wake in current experimental measurements of final hadron distributions because they are averaged over all possible initial jet production locations and propagation directions. To overcome this difficulty, we develop a deep learning assisted jet tomography which uses the full information of the final hadrons from jets to localize the initial jet production positions. This method can help to constrain the initial regions of jet production in heavy-ion collisions and enable a differential study of Mach-cones with different jet path length and orientation relative to the radial flow of the QGP in heavy-ion collisions.
Submitted 6 June, 2022;
originally announced June 2022.
-
A Demographic Attribute Guided Approach to Age Estimation
Authors:
Zhicheng Cao,
Kaituo Zhang,
Liaojun Pang,
Heng Zhao
Abstract:
Face-based age estimation has attracted enormous attention due to its wide applications in public security surveillance, human-computer interaction, etc. With the vigorous development of deep learning, age estimation based on deep neural networks has become the mainstream practice. However, seeking a more suitable problem paradigm for age change characteristics, designing the corresponding loss function, and designing a more effective feature extraction module still need to be studied. Moreover, the change of face age is also related to demographic attributes such as ethnicity and gender, and the dynamics of different age groups are also quite different. This problem has so far not received enough attention. How to use demographic attribute information to improve the performance of age estimation remains to be further explored. In light of these issues, this research makes full use of auxiliary information of face attributes and proposes a new age estimation approach with an attribute guidance module. We first design a multi-scale attention residual convolution unit (MARCU) to extract robust facial features, rather than simply using standard feature modules such as VGG and ResNet. Then, after being specially processed through fully connected (FC) layers, the facial demographic attributes are weight-summed by a 1×1 convolutional layer and eventually merged with the age features by a global FC layer. Lastly, we propose a new error compression ranking (ECR) loss to better converge the age regression value. Experimental results on three public datasets, UTKFace, LAP2016 and Morph, show that our proposed approach achieves superior performance compared to other state-of-the-art methods.
Submitted 20 May, 2022;
originally announced May 2022.
-
A Bidirectional Conversion Network for Cross-Spectral Face Recognition
Authors:
Zhicheng Cao,
Jiaxuan Zhang,
Liaojun Pang
Abstract:
Face recognition in the infrared (IR) band has become an important supplement to visible light face recognition due to its advantages of independence from background light, strong penetration, and the ability to image under harsh conditions such as nighttime, rain and fog. However, cross-spectral face recognition (i.e., VIS to IR) is very challenging due to the dramatic difference between visible light and IR imagery as well as the lack of paired training data. This paper proposes a framework of bidirectional cross-spectral conversion (BCSC-GAN) between the heterogeneous face images, and designs an adaptive weighted fusion mechanism based on information fusion theory. The network reduces the cross-spectral recognition problem to an intra-spectral problem, and improves performance by fusing bidirectional information. Specifically, a face identity retaining module (IRM) is introduced with the ability to preserve identity features, and a new composite loss function is designed to overcome the modal differences caused by different spectral characteristics. Two datasets, TINDERS and CASIA, were tested, where performance metrics of FID, recognition rate, equal error rate and normalized distance were compared. Results show that our proposed network is superior to other state-of-the-art methods. Additionally, the proposed rule of Self Adaptive Weighted Fusion (SAWF) yields better recognition results than the unfused case and other commonly used traditional fusion rules, which further justifies the effectiveness and superiority of the proposed bidirectional conversion approach.
Submitted 3 May, 2022;
originally announced May 2022.
-
On the uncertainty principle of neural networks
Authors:
Jun-Jie Zhang,
Dong-Xiao Zhang,
Jian-Nan Chen,
Long-Gang Pang,
Deyu Meng
Abstract:
Despite the successes in many fields, it is found that neural networks are difficult to be both accurate and robust, i.e., high accuracy networks are often vulnerable. Various empirical and analytic studies have substantiated that there is more or less a trade-off between the accuracy and robustness of neural networks. If the property is inherent, applications based on the neural networks are vulnerable with untrustworthy predictions. To more deeply explore and understand this issue, in this study we show that the accuracy-robustness trade-off is an intrinsic property whose underlying mechanism is closely related to the uncertainty principle in quantum mechanics. By relating the loss function in neural networks to the wave function in quantum mechanics, we show that the inputs and their conjugates cannot be resolved by a neural network simultaneously. This work thus provides an insightful explanation for the inevitability of the accuracy-robustness dilemma for general deep networks from an entirely new perspective, and furthermore, reveals a potential possibility to study various properties of neural networks with the mature mathematical tools in quantum physics.
Submitted 27 October, 2022; v1 submitted 3 May, 2022;
originally announced May 2022.
-
Adaptive Split-Fusion Transformer
Authors:
Zixuan Su,
Hao Zhang,
Jingjing Chen,
Lei Pang,
Chong-Wah Ngo,
Yu-Gang Jiang
Abstract:
Neural networks for visual content understanding have recently evolved from convolutional ones (CNNs) to transformers. The prior (CNN) relies on small-windowed kernels to capture the regional clues, demonstrating solid local expressiveness. On the contrary, the latter (transformer) establishes long-range global connections between localities for holistic learning. Inspired by this complementary nature, there is a growing interest in designing hybrid models to best utilize each technique. Current hybrids merely replace convolutions as simple approximations of linear projection or juxtapose a convolution branch with attention, without concerning the importance of local/global modeling. To tackle this, we propose a new hybrid named Adaptive Split-Fusion Transformer (ASF-former) to treat convolutional and attention branches differently with adaptive weights. Specifically, an ASF-former encoder equally splits feature channels into half to fit dual-path inputs. Then, the outputs of dual-path are fused with weighting scalars calculated from visual cues. We also design the convolutional path compactly for efficiency concerns. Extensive experiments on standard benchmarks, such as ImageNet-1K, CIFAR-10, and CIFAR-100, show that our ASF-former outperforms its CNN, transformer counterparts, and hybrid pilots in terms of accuracy (83.9% on ImageNet-1K), under similar conditions (12.9G MACs/56.7M Params, without large-scale pre-training). The code is available at: https://github.com/szx503045266/ASF-former.
Submitted 16 August, 2023; v1 submitted 26 April, 2022;
originally announced April 2022.
-
LoL: A Comparative Regularization Loss over Query Reformulation Losses for Pseudo-Relevance Feedback
Authors:
Yunchang Zhu,
Liang Pang,
Yanyan Lan,
Huawei Shen,
Xueqi Cheng
Abstract:
Pseudo-relevance feedback (PRF) has proven to be an effective query reformulation technique to improve retrieval accuracy. It aims to alleviate the mismatch of linguistic expressions between a query and its potential relevant documents. Existing PRF methods independently treat revised queries originating from the same query but using different numbers of feedback documents, resulting in severe query drift. Without comparing the effects of two different revisions from the same query, a PRF model may incorrectly focus on the additional irrelevant information increased in the more feedback, and thus reformulate a query that is less effective than the revision using the less feedback. Ideally, if a PRF model can distinguish between irrelevant and relevant information in the feedback, the more feedback documents there are, the better the revised query will be. To bridge this gap, we propose the Loss-over-Loss (LoL) framework to compare the reformulation losses between different revisions of the same query during training. Concretely, we revise an original query multiple times in parallel using different amounts of feedback and compute their reformulation losses. Then, we introduce an additional regularization loss on these reformulation losses to penalize revisions that use more feedback but gain larger losses. With such comparative regularization, the PRF model is expected to learn to suppress the extra increased irrelevant information by comparing the effects of different revised queries. Further, we present a differentiable query reformulation method to implement this framework. This method revises queries in the vector space and directly optimizes the retrieval performance of query vectors, applicable for both sparse and dense retrieval models. Empirical evaluation demonstrates the effectiveness and robustness of our method for two typical sparse and dense retrieval models.
Submitted 25 April, 2022;
originally announced April 2022.
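The comparative regularization described in the LoL abstract above, which penalizes revisions that use more feedback but incur larger losses, can be sketched as a pairwise hinge penalty. This is an assumed form for illustration only: the abstract does not give the exact formula, and the function name `comparative_regularization` and its input layout are hypothetical.

```python
def comparative_regularization(losses_by_feedback):
    """Sum of hinge penalties over pairs of revisions of the same query.

    losses_by_feedback: list of (num_feedback_docs, reformulation_loss).
    For every pair where one revision uses strictly more feedback
    documents, a penalty is added only if that revision's loss is larger.
    """
    ordered = sorted(losses_by_feedback)  # ascending by feedback amount
    penalty = 0.0
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            k_less, loss_less = ordered[i]
            k_more, loss_more = ordered[j]
            if k_more > k_less:
                penalty += max(0.0, loss_more - loss_less)
    return penalty


# Three revisions of one query using 1, 3 and 5 feedback documents.
# Only the (3 docs, 5 docs) pair is penalized: more feedback, larger loss.
print(comparative_regularization([(1, 0.9), (3, 0.7), (5, 0.8)]))
```

Under this sketch the penalty vanishes whenever the loss never increases with the amount of feedback, which matches the ideal behavior the abstract describes.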
-
Match-Prompt: Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning
Authors:
Shicheng Xu,
Liang Pang,
Huawei Shen,
Xueqi Cheng
Abstract:
Text matching is a fundamental technique in both information retrieval and natural language processing. Text matching tasks share the same paradigm that determines the relationship between two given texts. The relationships vary from task to task, e.g., relevance in document retrieval, semantic alignment in paraphrase identification and answerable judgment in question answering. However, the essential signals for text matching remain in a finite scope, i.e., exact matching, semantic matching, and inference matching. Ideally, a good text matching model can learn to capture and aggregate these signals for different matching tasks to achieve competitive performance, while recent state-of-the-art text matching models, e.g., Pre-trained Language Models (PLMs), are hard to generalize. This is because end-to-end supervised learning on a task-specific dataset makes the model overemphasize the data sample bias and task-specific signals instead of the essential matching signals. To overcome this problem, we adopt a specialization-generalization training strategy and refer to it as Match-Prompt. In the specialization stage, descriptions of different matching tasks are mapped to a few prompt tokens. In the generalization stage, the matching model explores the essential matching signals by being trained on diverse matching tasks. Highly diverse matching tasks prevent the model from fitting the data bias of a specific task, so that the model can focus on learning the essential matching signals. Meanwhile, the prompt tokens obtained in the first step help the model distinguish different task-specific matching signals. Experimental results on public datasets show that Match-Prompt can improve the multi-task generalization capability of PLMs in text matching and yield better in-domain multi-task, out-of-domain multi-task and new task adaptation performance than multi-task and task-specific models trained by the previous fine-tuning paradigm.
Submitted 19 August, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Systematically Evaluation of Challenge Obfuscated APUFs
Authors:
Yansong Gao,
Jianrong Yao,
Lihui Pang,
Zhi Zhang,
Anmin Fu,
Naixue Xiong,
Hyoungshick Kim
Abstract:
As a well-known physical unclonable function that can provide a huge number of challenge-response pairs (CRPs) with a compact design and full compatibility with current electronic fabrication processes, the arbiter PUF (APUF) has attracted great attention. To improve its resilience against modeling attacks, many APUF variants have been proposed so far. Though the modeling resilience of response-obfuscated APUF variants such as XOR-APUF and lightweight secure APUF (LSPUF) has been well studied, the challenge-obfuscated APUFs (CO-APUFs) such as feed-forward APUF (FF-APUF) and XOR-FF-APUF are less elucidated, especially with deep learning (DL) methods. This work systematically evaluates five CO-APUFs, including three influential designs (FF-APUF, XOR-FF-APUF and iPUF), one very recent design, and our newly optimized design (dubbed OAX-FF-APUF), in terms of their reliability, uniformity (related to uniqueness), and modeling resilience. Three DL techniques, GRU, TCN and MLP, are employed to examine these CO-APUFs' modeling resilience; the first two are newly explored. With the computational resources of a common personal computer, we show that all five CO-APUFs with relatively large scale can be successfully modeled, with attack accuracy higher than or close to their reliability. Hyper-parameter tuning of the DL technique is crucial for implementing efficient attacks. Increasing the scale of the CO-APUF is validated to improve the resilience, but this should be done while minimizing the reliability degradation. Given the powerful capability of DL techniques affirmed by our results, we recommend that DL, specifically the MLP technique, which consistently demonstrates the best efficacy, always be considered for examining the modeling resilience when newly composited APUFs are devised or, more broadly, when other strong PUFs are constructed.
Submitted 29 March, 2022;
originally announced March 2022.
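For readers unfamiliar with why arbiter PUFs invite modeling attacks, the following minimal sketch simulates an APUF under the additive linear delay model commonly used in the modeling-attack literature, plus an XOR-APUF built from several such chains. It is background illustration, not the evaluation code of the paper above; the weights are made-up values, the convention that a positive delay difference maps to response 1 is an assumption, and all function names are hypothetical.

```python
def parity_features(challenge):
    """Map challenge bits c_1..c_n to the parity vector used by the model.

    features[i] is the product of (1 - 2*c_j) for j >= i, and the last
    entry is a constant 1 that acts as a bias term.
    """
    features = [1] * (len(challenge) + 1)
    for i in range(len(challenge) - 1, -1, -1):
        features[i] = features[i + 1] * (1 - 2 * challenge[i])
    return features


def apuf_response(weights, challenge):
    """Response of one arbiter chain: sign of a linear delay difference."""
    delay = sum(w * f for w, f in zip(weights, parity_features(challenge)))
    return 1 if delay > 0 else 0  # assumed sign convention


def xor_apuf_response(weight_sets, challenge):
    """XOR-APUF: the same challenge feeds every chain; responses are XORed."""
    out = 0
    for weights in weight_sets:
        out ^= apuf_response(weights, challenge)
    return out


# A 3-stage toy instance with hypothetical delay weights (n + 1 per chain).
chains = [[1.0, -2.0, 0.5, 0.25], [1.0, 1.0, 0.0, 0.0]]
print(xor_apuf_response(chains, [0, 1, 0]))
```

Because a single chain is linear in the parity features, it is learnable from CRPs alone; composite designs such as XOR-APUF add non-linearity on top of that, which is the resilience the DL attacks in the abstract probe.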
-
High Phonon Scattering Rates Suppress Thermal Conductivity in Hyperstoichiometric Uranium Dioxide
Authors:
Hao Ma,
Matthew S. Bryan,
Judy W. L. Pang,
Douglas L. Abernathy,
Daniel J. Antonio,
Krzysztof Gofryk,
Michael E. Manley
Abstract:
Uranium dioxide (UO$_2$), one of the most important nuclear fuels, can accumulate excess oxygen atoms as interstitial defects, which significantly impacts its thermal properties. In this study, thermal conductivity and inelastic neutron scattering measurements on UO$_2$ and UO$_{2+x}$ (x=0.3, 0.4, 0.8, 0.11) were performed at low temperatures (2-300 K). The thermal conductivity of UO$_{2+x}$ is significantly suppressed compared to UO$_2$ except near the Néel temperature $T_N$ = 30.8 K, where it is independent of x. Phonon measurements demonstrate that the heat capacities and phonon group velocities of UO$_2$ and UO$_{2+x}$ are similar and that the suppressed thermal conductivity in UO$_{2+x}$ results from high phonon scattering rates. These new insights advance our fundamental understanding of thermal transport properties in advanced nuclear fuels.
Submitted 23 March, 2022;
originally announced March 2022.
-
3D structure of jet-induced diffusion wake in an expanding quark-gluon plasma
Authors:
Zhong Yang,
Tan Luo,
Wei Chen,
Long-Gang Pang,
Xin-Nian Wang
Abstract:
The diffusion wake accompanying the jet-induced Mach cone provides a unique probe of the properties of quark-gluon plasma in high-energy heavy-ion collisions. It can be characterized by a depletion of soft hadrons in the opposite direction of the propagating jet. We explore the 3D structure of the diffusion wake induced by $γ$-triggered jets in Pb+Pb collisions at the LHC energy within the coupled linear Boltzmann transport and hydro model. We identify a valley structure caused by the diffusion wake on top of a ridge from the initial multiple parton interaction (MPI) in jet-hadron correlation as a function of rapidity and azimuthal angle. This leads to a double-peak structure in the rapidity distribution of soft hadrons in the opposite direction of the jets as an unambiguous signal of the diffusion wake. Using a two-Gaussian fit, we extract the diffusion wake and MPI contributions to the double peak. The diffusion wake valley is found to deepen with the jet energy loss as characterized by the $γ$-jet asymmetry. Its sensitivity to the equation of state and shear viscosity is also studied.
Submitted 26 January, 2023; v1 submitted 7 March, 2022;
originally announced March 2022.
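The two-Gaussian description in the abstract above, a diffusion-wake valley on top of an MPI ridge, can be illustrated with a minimal sketch of the model shape. It assumes a wide positive Gaussian for the ridge and a narrower Gaussian subtracted for the valley, both centered at zero rapidity; the parameter values are invented for illustration and the function names are hypothetical, not taken from the paper.

```python
import math


def gaussian(x, amplitude, sigma):
    """Zero-centered Gaussian with the given amplitude and width."""
    return amplitude * math.exp(-0.5 * (x / sigma) ** 2)


def ridge_minus_valley(rapidity, ridge_amp, ridge_sigma, valley_amp, valley_sigma):
    """Assumed two-Gaussian shape: MPI ridge minus diffusion-wake valley."""
    return (gaussian(rapidity, ridge_amp, ridge_sigma)
            - gaussian(rapidity, valley_amp, valley_sigma))


# Invented parameters: a broad ridge with a narrow dip carved out of it.
params = dict(ridge_amp=1.0, ridge_sigma=2.0, valley_amp=0.5, valley_sigma=0.5)
for y in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(y, round(ridge_minus_valley(y, **params), 3))
```

With these values the curve dips at zero rapidity and peaks on either side, which reproduces the double-peak structure qualitatively; fitting the four parameters to data would separate the two contributions.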
-
Audio-Visual Object Classification for Human-Robot Collaboration
Authors:
A. Xompero,
Y. L. Pang,
T. Patten,
A. Prabhakar,
B. Calli,
A. Cavallaro
Abstract:
Human-robot collaboration requires the contactless estimation of the physical properties of containers manipulated by a person, for example while pouring content in a cup or moving a food box. Acoustic and visual signals can be used to estimate the physical properties of such objects, which may vary substantially in shape, material and size, and also be occluded by the hands of the person. To facilitate comparisons and stimulate progress in solving this problem, we present the CORSMAL challenge and a dataset to assess the performance of the algorithms through a set of well-defined performance scores. The tasks of the challenge are the estimation of the mass, capacity, and dimensions of the object (container), and the classification of the type and amount of its content. A novel feature of the challenge is our real-to-simulation framework for visualising and assessing the impact of estimation errors in human-to-robot handovers.
Submitted 3 March, 2022;
originally announced March 2022.
-
Deep learning on nuclear mass and $α$ decay half-lives
Authors:
Chen-Qi Li,
Chao-Nan Tong,
Hong-**g Du,
Long-Gang Pang
Abstract:
Ab-initio calculations of nuclear masses, binding energies and $α$ decay half-lives are intractable for heavy nuclei because of the curse of dimensionality in many-body quantum simulations as the proton number ($\mathrm{Z}$) and neutron number ($\mathrm{N}$) grow. We take advantage of the powerful non-linear transformation and feature representation ability of deep neural networks (DNN) to predict the nuclear masses and $α$ decay half-lives. For the nuclear binding energy prediction problem we achieve a standard deviation of $σ=0.263$ MeV in 10-fold cross validation on 2149 nuclei. Word-vectors, which are high-dimensional representations of nuclei taken from the hidden layers of the mass-regression DNN, help us to calculate $α$ decay half-lives. For this task, we get $σ=0.797$ on $\log_{10}T_{1/2}$ in 100 times 10-fold cross validation on 350 nuclei, and $σ=0.731$ on 486 nuclei. We also find that physical priors such as shell structure, magic numbers and augmented inputs inspired by the Finite Range Droplet Model are important for this small-data regression task.
Submitted 9 June, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
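Illustration for the entry above: the 10-fold cross validation it mentions can be sketched as below. This is a minimal sketch of the index bookkeeping only, not the authors' code; the function name `k_fold_indices`, the shuffling seed and the interleaved fold assignment are assumptions made for the example, and no DNN is trained.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation (hypothetical helper)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# 2149 is the number of nuclei quoted in the abstract.
splits = list(k_fold_indices(2149, k=10))
```

Each sample appears in exactly one test fold, so every nucleus is predicted once by a model that never saw it during training.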
-
Signatures of the spin Hall effect in hot and dense QCD matter
Authors:
Baochi Fu,
Longgang Pang,
Huichao Song,
Yi Yin
Abstract:
The spin Hall effect (SHE) is a generation of spin polarization for moving spin carriers in materials under an external electric field and has been observed in semiconductors, metals, and insulators at or below room temperature. Recent theoretical analyses show that spin Hall current can be induced by the baryon chemical potential gradient which plays the role of the analogous electric field and which becomes sizable in the fireballs created in heavy-ion collisions at beam energy of ${\cal O}(10)$ GeV. In this letter, we study this important mechanism for spin polarization generation that has not been systematically explored before and predict the signature of the SHE in those collisions using a (3+1)D viscous hydrodynamic model MUSIC with AMPT initial condition. We propose to use the second Fourier coefficients of the net spin polarization of Lambda hyperon as sensitive probes to search for the SHE. Those SHE observables show a qualitative difference in both the sign and beam energy dependence for the situations with and without the SHE. Future experimental observation of these distinct qualitative features would provide strong evidence for the existence of the SHE in the hot and dense QCD matter at trillions of degrees.
Submitted 30 January, 2022;
originally announced January 2022.
-
Event-by-event jet anisotropy and hard-soft tomography of the quark-gluon plasma
Authors:
Yayun He,
Wei Chen,
Tan Luo,
Shanshan Cao,
Long-Gang Pang,
Xin-Nian Wang
Abstract:
Suppression of jet spectra or jet quenching in high-energy heavy-ion collisions is caused by jet energy loss in the dense medium. The azimuthal anisotropy of jet energy loss in non-central heavy-ion collisions can lead to jet anisotropy which in turn can provide insight into the path-length dependence of jet quenching. This is investigated within the Linear Boltzmann Transport (LBT) model which simulates both elastic scattering and medium-induced gluon radiation based on perturbative QCD for jet shower and medium recoil partons as well as radiated gluons as they propagate through the quark-gluon plasma (QGP). The dynamical evolution of the QGP in each event of heavy-ion collisions is provided by the (3+1)D CLVisc hydrodynamic model with fully fluctuating initial conditions. This framework has been shown to describe the suppression of single inclusive jet spectra well. We calculate in this study the elliptic ($v_{2}^{\rm jet}$) and triangular ($v_{3}^{\rm jet}$) anisotropy coefficients of the single inclusive jet spectra in Pb+Pb collisions at the LHC energies. We investigate the colliding energy, centrality, and jet transverse momentum dependence of the jet anisotropy, as well as their event-by-event correlation with the flow coefficients of the soft bulk hadrons. An approximate linear correlation between jet and bulk $v_2$ is found. The effect of the bulk $v_n$ fluctuation on $v_n^{\rm jet}$ is found to be negligible. The jet-induced medium excitation, which is influenced by radial flow, is shown to enhance $v_{2}^{\rm jet}$ and the enhancement increases with the jet cone size. The jet elliptic anisotropy $v_{2}^{\rm jet}$ is also found to be slightly enhanced by the shear viscosity of the bulk medium in comparison to the LBT results when jets propagate through an ideal hydrodynamic QGP medium.
Submitted 20 January, 2022;
originally announced January 2022.
-
Benchmarking Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization
Authors:
Ke Shang,
Tianye Shu,
Hisao Ishibuchi,
Yang Nan,
Lie Meng Pang
Abstract:
In the evolutionary multi-objective optimization (EMO) field, the standard practice is to present the final population of an EMO algorithm as the output. However, it has been shown that the final population often includes solutions which are dominated by other solutions generated and discarded in previous generations. Recently, a new EMO framework has been proposed to solve this issue by storing all the non-dominated solutions generated during the evolution in an archive and selecting a subset of solutions from the archive as the output. The key component in this framework is the subset selection from the archive which usually stores a large number of candidate solutions. However, most studies on subset selection focus on small candidate solution sets for environmental selection. There is no benchmark test suite for large-scale subset selection. This paper aims to fill this research gap by proposing a benchmark test suite for subset selection from large candidate solution sets, and comparing some representative methods using the proposed test suite. The proposed test suite together with the benchmarking studies provides a baseline for researchers to understand, use, compare, and develop subset selection methods in the EMO field.
Submitted 29 March, 2022; v1 submitted 17 January, 2022;
originally announced January 2022.
-
Machine Learning in Nuclear Physics
Authors:
Amber Boehnlein,
Markus Diefenthaler,
Cristiano Fanelli,
Morten Hjorth-Jensen,
Tanja Horn,
Michelle P. Kuchera,
Dean Lee,
Witold Nazarewicz,
Kostas Orginos,
Peter Ostroumov,
Long-Gang Pang,
Alan Poon,
Nobuo Sato,
Malachi Schram,
Alexander Scheinker,
Michael S. Smith,
Xin-Nian Wang,
Veronique Ziegler
Abstract:
Advances in machine learning methods provide tools that have broad applicability in scientific research. These techniques are being applied across the diversity of nuclear physics research topics, leading to advances that will facilitate scientific discoveries and societal applications.
This Review gives a snapshot of nuclear physics research which has been transformed by machine learning techniques.
Submitted 2 May, 2022; v1 submitted 4 December, 2021;
originally announced December 2021.
-
High-Speed Light Focusing through Scattering Medium by Cooperatively Accelerated Genetic Algorithm
Authors:
Shu Guo,
Lin Pang
Abstract:
We develop an accelerated Genetic Algorithm (GA) system built on the cooperation of a field-programmable gate array (FPGA) and optimized GA parameters. We find that an enhanced decay of the mutation rate makes the GA converge much faster, enabling parameter-induced acceleration of the GA. Furthermore, the accelerated configuration of the GA is programmed in the FPGA to boost processing speed at the hardware level without external computation devices. This system is able to focus light through a scattering medium within 4 seconds with robust noise resistance and stable repetition performance, which could be further reduced to the millisecond level with an advanced board configuration. This study overcomes a long-standing limitation of the GA and promotes its application in dynamic scattering media, with the capability to tackle wavefront shaping in biological material.
Submitted 29 November, 2021;
originally announced November 2021.
-
Uncertainty Calibration for Ensemble-Based Debiasing Methods
Authors:
Ruibin Xiong,
Yimeng Chen,
Liang Pang,
Xueqi Chen,
Yanyan Lan
Abstract:
Ensemble-based debiasing methods have been shown effective in mitigating the reliance of classifiers on specific dataset bias, by exploiting the output of a bias-only model to adjust the learning target. In this paper, we focus on the bias-only model in these ensemble-based methods, which plays an important role but has not gained much attention in the existing literature. Theoretically, we prove that the debiasing performance can be damaged by inaccurate uncertainty estimations of the bias-only model. Empirically, we show that existing bias-only models fall short in producing accurate uncertainty estimations. Motivated by these findings, we propose to conduct calibration on the bias-only model, thus achieving a three-stage ensemble-based debiasing framework, including bias modeling, model calibrating, and debiasing. Experimental results on NLI and fact verification tasks show that our proposed three-stage debiasing framework consistently outperforms the traditional two-stage one in out-of-distribution accuracy.
Submitted 7 November, 2021;
originally announced November 2021.
-
A New Sequential Optimality Condition of Cardinality-Constrained Optimization Problems and Application
Authors:
Li** Pang,
Menglong Xue,
Na Xu
Abstract:
In this paper, we consider cardinality-constrained optimization problems and propose a new sequential optimality condition for the continuous relaxation reformulation, which has recently become popular. It is stronger than the existing results and is still a first-order necessary condition for cardinality-constrained problems without any additional assumptions. Meanwhile, we provide a problem-tailored weaker constraint qualification, which guarantees that points satisfying the new sequential condition are Mordukhovich-type stationary points. We also improve the theoretical results for the augmented Lagrangian algorithm. Under the same conditions as the existing results, we prove that any feasible accumulation point of the iterative sequence generated by the algorithm satisfies the new sequential optimality condition. Furthermore, the algorithm can converge to a Mordukhovich-type (essentially strong) stationary point if the problem-tailored constraint qualification is satisfied.
Submitted 4 October, 2021;
originally announced October 2021.
-
Design and Evaluate Recomposited OR-AND-XOR-PUF
Authors:
Jianrong Yao,
Lihui Pang,
Zhi Zhang,
Wei Yang,
Anmin Fu,
Yansong Gao
Abstract:
Physical Unclonable Function (PUF) is a hardware security primitive with the desirable feature of low cost. Based on the space of challenge-response pairs (CRPs), PUFs fall into two categories: weak PUF and strong PUF. Though designing a reliable and secure lightweight strong PUF is challenging, there are continuing efforts to fill this gap due to the wide range of applications enabled by strong PUFs. It has been suggested that the combination of MAX and MIN bit-wise operations is promising for improving the modeling resilience when MAX and MIN are employed in the PUF recomposition. The main rationale is that each bit-wise operation might be mainly vulnerable to one specific type of modeling attack, so combining them can improve the holistic resilience. This work first evaluates the main PUF performance metrics, in particular the uniformity and reliability, of the OR-AND-XOR-PUF (OAX-PUF), denoted (x, y, z)-OAX-PUF. Compared with the most used l-XOR-PUF, the (x, y, z)-OAX-PUF exhibits better reliability given l=x+y+z without degrading the uniformity, which remains at 50%. We further examine the modeling resilience of the (x, y, z)-OAX-PUF against four powerful attacking strategies to date: the Logistic Regression (LR) attack, the reliability-assisted CMA-ES attack, the multilayer perceptron (MLP) attack, and the most recent hybrid LR-reliability attack. In comparison with the XOR-APUF, the OAX-APUF successfully defeats the CMA-ES attack. However, it shows no notable modeling accuracy drop against the other three attacks, though the attacking times of the LR and hybrid LR-reliability attacks are greatly prolonged. Overall, the OAX recomposition could be an alternative lightweight recomposition method to XOR for constructing strong PUFs if the underlying PUF, e.g., FF-APUF, has exhibited improved resilience to modeling attacks, because the OAX incurs smaller reliability degradation than XOR.
Submitted 25 April, 2022; v1 submitted 2 October, 2021;
originally announced October 2021.
-
Transductive Learning for Unsupervised Text Style Transfer
Authors:
Fei Xiao,
Liang Pang,
Yanyan Lan,
Yan Wang,
Huawei Shen,
Xueqi Cheng
Abstract:
Unsupervised style transfer models are mainly based on an inductive learning approach, which represents the style as embeddings, decoder parameters, or discriminator parameters and directly applies these general rules to the test cases. However, the lack of a parallel corpus hinders the ability of these inductive learning methods on this task. As a result, they are likely to produce severely inconsistent style expressions, like `the salad is rude`. To tackle this problem, we propose a novel transductive learning approach in this paper, based on a retrieval-based context-aware style representation. Specifically, an attentional encoder-decoder with a retriever framework is utilized. It involves the top-K relevant sentences in the target style in the transfer process. In this way, we can learn a context-aware style embedding to alleviate the above inconsistency problem. In this paper, both sparse (BM25) and dense (MIPS) retrieval functions are used, and two objective functions are designed to facilitate joint learning. Experimental results show that our method outperforms several strong baselines. The proposed transductive learning approach is general and effective for the task of unsupervised style transfer, and we will apply it to the other two typical methods in the future.
Submitted 16 September, 2021;
originally announced September 2021.
-
Adaptive Information Seeking for Open-Domain Question Answering
Authors:
Yunchang Zhu,
Liang Pang,
Yanyan Lan,
Huawei Shen,
Xueqi Cheng
Abstract:
Information seeking is an essential step for open-domain question answering to efficiently gather evidence from a large corpus. Recently, iterative approaches have been proven to be effective for complex questions, by recursively retrieving new evidence at each step. However, almost all existing iterative approaches use predefined strategies, either applying the same retrieval function multiple times or fixing the order of different retrieval functions, which cannot fulfill the diverse requirements of various questions. In this paper, we propose a novel adaptive information-seeking strategy for open-domain question answering, namely AISO. Specifically, the whole retrieval and answer process is modeled as a partially observed Markov decision process, where three types of retrieval operations (e.g., BM25, DPR, and hyperlink) and one answer operation are defined as actions. According to the learned policy, AISO could adaptively select a proper retrieval action to seek the missing evidence at each step, based on the collected evidence and the reformulated query, or directly output the answer when the evidence set is sufficient for the question. Experiments on SQuAD Open and HotpotQA fullwiki, which serve as single-hop and multi-hop open-domain QA benchmarks, show that AISO outperforms all baseline methods with predefined strategies in terms of both retrieval and answer evaluations.
Submitted 14 September, 2021;
originally announced September 2021.
-
Toward the Understanding of Deep Text Matching Models for Information Retrieval
Authors:
Lijuan Chen,
Yanyan Lan,
Liang Pang,
Jiafeng Guo,
Xueqi Cheng
Abstract:
Semantic text matching is a critical problem in information retrieval. Recently, deep learning techniques have been widely used in this area and obtained significant performance improvements. However, most models are black boxes and it is hard to understand what happens in the matching process, due to the poor interpretability of deep learning. This paper aims at tackling this problem. The key idea is to test whether existing deep text matching methods satisfy some fundamental heuristics in information retrieval. Specifically, four heuristics are used in our study, i.e., term frequency constraint, term discrimination constraint, length normalization constraints, and TF-length constraint. Since deep matching models usually contain many parameters, it is difficult to conduct a theoretical study for these complicated functions. In this paper, we propose an empirical testing method. Specifically, we first construct some queries and documents to make them satisfy the assumption in a constraint, and then test to what extent a deep text matching model trained on the original dataset satisfies the corresponding constraint. Besides, a well-known attribution-based interpretation method, namely integrated gradients, is adopted to conduct detailed analysis and to guide feasible improvements. Experimental results on LETOR 4.0 and MS Marco show that all the investigated deep text matching methods, both representation-based and interaction-based, satisfy the above constraints with high probability. We further extend these constraints to the semantic settings, which are shown to be better satisfied for all the deep text matching models. These empirical findings give a clear understanding of why deep text matching models usually perform well in information retrieval. We believe the proposed evaluation methodology will be useful for testing future deep text matching models.
Submitted 16 August, 2021;
originally announced August 2021.
-
Modeling Relevance Ranking under the Pre-training and Fine-tuning Paradigm
Authors:
Lin Bo,
Liang Pang,
Gang Wang,
Jun Xu,
XiuQiang He,
Ji-Rong Wen
Abstract:
Recently, pre-trained language models such as BERT have been applied to document ranking for information retrieval, which first pre-train a general language model on an unlabeled large corpus and then conduct ranking-specific fine-tuning on expert-labeled relevance datasets. Ideally, an IR system would model relevance from a user-system dualism: the user's view and the system's view. The user's view judges the relevance based on the activities of "real users" while the system's view focuses on the relevance signals from the system side, e.g., from the experts or algorithms. Inspired by the user-system relevance views and the success of pre-trained language models, in this paper we propose a novel ranking framework called Pre-Rank that takes both the user's view and the system's view into consideration, under the pre-training and fine-tuning paradigm. Specifically, to model the user's view of relevance, Pre-Rank pre-trains the initial query-document representations based on large-scale user activity data such as the click log. To model the system's view of relevance, Pre-Rank further fine-tunes the model on expert-labeled relevance data. More importantly, the pre-trained representations are fine-tuned together with handcrafted learning-to-rank features under a wide and deep network architecture. In this way, Pre-Rank can model the relevance by incorporating the relevant knowledge and signals from both real search users and the IR experts. To verify the effectiveness of Pre-Rank, we present two implementations that use BERT and SetRank as the underlying ranking model, respectively. Experimental results based on three publicly available benchmarks show that in both implementations, Pre-Rank outperforms the underlying ranking model and achieves state-of-the-art performance.
Submitted 12 August, 2021;
originally announced August 2021.
-
OHPL: One-shot Hand-eye Policy Learner
Authors:
Changjae Oh,
Yik Lung Pang,
Andrea Cavallaro
Abstract:
The control of a robot for manipulation tasks generally relies on object detection and pose estimation. An attractive alternative is to learn control policies directly from raw input data. However, this approach is time-consuming and expensive since learning the policy requires many trials with robot actions in the physical environment. To reduce the training cost, the policy can be learned in simulation with a large set of synthetic images. The limit of this approach is the domain gap between the simulation and the robot workspace. In this paper, we propose to learn a policy for robot reaching movements from a single image captured directly in the robot workspace from a camera placed on the end-effector (a hand-eye camera). The idea behind the proposed policy learner is that view changes seen from the hand-eye camera produced by actions in the robot workspace are analogous to locating a region-of-interest in a single image by performing sequential object localisation. This similar view change enables training of object reaching policies using reinforcement-learning-based sequential object localisation. To facilitate the adaptation of the policy to view changes in the robot workspace, we further present a dynamic filter that learns to bias an input state to remove irrelevant information for an action decision. The proposed policy learner can be used as a powerful representation for robotic tasks, and we validate it on static and moving object reaching tasks.
Submitted 6 August, 2021;
originally announced August 2021.
-
Probing criticality with deep learning in relativistic heavy-ion collisions
Authors:
Yige Huang,
Long-Gang Pang,
Xiaofeng Luo,
Xin-Nian Wang
Abstract:
Systems with different interactions could develop the same critical behaviour due to the underlying symmetry and universality. Using this principle of universality, we can embed critical correlations modeled on the 3D Ising model into the simulated data of heavy-ion collisions, hiding weak signals of a few inter-particle correlations within a large particle cloud. Employing a point cloud network with dynamical edge convolution, we are able to identify events with critical fluctuations through supervised learning, and pick out a large fraction of signal particles used for decision-making in each single event.
Submitted 2 March, 2022; v1 submitted 25 July, 2021;
originally announced July 2021.
-
(3+1)-D viscous hydrodynamics CLVisc at finite net baryon density: identified particle spectra, anisotropic flows and flow fluctuations across BES energies
Authors:
Xiang-Yu Wu,
Guang-You Qin,
Long-Gang Pang,
Xin-Nian Wang
Abstract:
To study the bulk properties of the quark-gluon-plasma (QGP) produced at the beam energy scan (BES) energies at the Relativistic Heavy Ion Collider (RHIC), we extend the (3+1)-dimensional viscous hydrodynamics CLVisc to include net baryon number conservation and an Israel-Stewart-like equation for baryon diffusion with the NEOS-BQS equation of state, fluctuating initial conditions from the Monte-Carlo Glauber model, and the afterburner SMASH. This integrated framework is shown to provide a good description of identified particle spectra, mean transverse momenta and anisotropic flows for different centralities and over a wide range of collision energies (7.7-62.4 GeV). It is found that the mean momenta of identified particles and anisotropic flows increase mildly with the collision energy due to larger radial flow. We further compute the multiple-particle cumulant ratio $v_2\{4\}/v_2\{2\}$ of elliptic flow across BES energies, and find that the relative fluctuations of elliptic flow are insensitive to the collision energy, consistent with the preliminary STAR data. Our model provides a benchmark for understanding the RHIC-BES data and studying the critical properties and phase structure of hot and dense QCD matter.
Submitted 30 March, 2022; v1 submitted 10 July, 2021;
originally announced July 2021.
-
Towards safe human-to-robot handovers of unknown containers
Authors:
Yik Lung Pang,
Alessio Xompero,
Changjae Oh,
Andrea Cavallaro
Abstract:
Safe human-to-robot handovers of unknown objects require accurate estimation of hand poses and object properties, such as shape, trajectory, and weight. Accurately estimating these properties requires the use of scanned 3D object models or expensive equipment, such as motion capture systems and markers, or both. However, testing handover algorithms with robots may be dangerous for the human and, when the object is an open container with liquids, for the robot. In this paper, we propose a real-to-simulation framework to develop safe human-to-robot handovers with estimations of the physical properties of unknown cups or drinking glasses and estimations of the human hands from videos of a human manipulating the container. We complete the handover in simulation, and we estimate a region that is not occluded by the hand of the human holding the container. We also quantify the safeness of the human and object in simulation. We validate the framework using public recordings of containers manipulated before a handover and show the safeness of the handover when using noisy estimates from a range of perceptual algorithms.
Submitted 2 July, 2021;
originally announced July 2021.
-
Sketch and Customize: A Counterfactual Story Generator
Authors:
Changying Hao,
Liang Pang,
Yanyan Lan,
Yan Wang,
Jiafeng Guo,
Xueqi Cheng
Abstract:
Recent text generation models can easily generate relevant and fluent text for a given input, but they lack causal reasoning ability when parts of that input are changed. Counterfactual story rewriting is a recently proposed task that tests the causal reasoning ability of text generation models; it requires a model to predict the corresponding story ending when the condition is modified to a counterfactual one. Previous works have shown that the traditional sequence-to-sequence model cannot handle this problem well, as it often captures spurious correlations between the original and counterfactual endings instead of the causal relations between conditions and endings. To address this issue, we propose a sketch-and-customize generation model guided by the causality implied in the conditions and endings. In the sketch stage, a skeleton is extracted from the original ending by removing words that conflict with the counterfactual condition. In the customize stage, a generation model fills proper words into the skeleton under the guidance of the counterfactual condition. In this way, the obtained counterfactual ending is both relevant to the original ending and consistent with the counterfactual condition. Experimental results show that the proposed model generates much better endings than the traditional sequence-to-sequence model.
Submitted 2 April, 2021;
originally announced April 2021.