Search | arXiv e-print repository

arXiv:2405.19853 [pdf]

Correlated Electronic Structure and Density-Wave Gap in Trilayer Nickelate La4Ni3O10

Authors: X. Du, Y. D. Li, Y. T. Cao, C. Y. Pei, M. X. Zhang, W. X. Zhao, K. Y. Zhai, R. Z. Xu, Z. K. Liu, Z. W. Li, J. K. Zhao, G. Li, Y. L. Chen, Y. P. Qi, H. J. Guo, L. X. Yang

Abstract: The discovery of pressurized superconductivity at 80 K in La3Ni2O7 officially brings nickelates into the family of high-temperature superconductors, which gives rise to not only new insights but also mysteries in the strongly correlated superconductivity. More recently, the sibling compound La4Ni3O10 was also shown to be superconducting below about 25 K under pressure, further boosting the popularity of nickelates in the Ruddlesden-Popper phase. In this study, combining high-resolution angle-resolved photoemission spectroscopy and ab initio calculation, we systematically investigate the electronic structures of La4Ni3O10 at ambient pressure. We reveal a high resemblance of La4Ni3O10 with La3Ni2O7 in the orbital-dependent fermiology and electronic structure, suggesting a similar electronic correlation between the two compounds. The temperature-dependent measurements imply an orbital-dependent energy gap related to the density-wave transition in La4Ni3O10. By comparing the theoretical pressure-dependent electronic structure, clues about the superconducting high-pressure phase can be deduced from the ambient measurements, providing crucial information for deciphering the unconventional superconductivity in nickelates.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.15362 [pdf, other]

Pipeline Parallelism with Controllable Memory

Authors: Penghui Qi, Xinyi Wan, Nyamdavaa Amar, Min Lin

Abstract: Pipeline parallelism has been widely explored, but most existing schedules lack a systematic methodology. In this paper, we propose a framework to decompose pipeline schedules as repeating a building block and we show that the lifespan of the building block decides the peak activation memory of the pipeline schedule. Guided by the observations, we find that almost all existing pipeline schedules, to the best of our knowledge, are memory inefficient. To address this, we introduce a family of memory efficient building blocks with controllable activation memory, which can reduce the peak activation memory to 1/2 of 1F1B without sacrificing efficiency, and even to 1/3 with comparable throughput. We can also achieve almost zero pipeline bubbles while maintaining the same activation memory as 1F1B. Our evaluations demonstrate that in pure pipeline parallelism settings, our methods outperform 1F1B by 7% to 55% in terms of throughput. When employing a grid search over hybrid parallelism hyperparameters in practical scenarios, our proposed methods demonstrate a 16% throughput improvement over the 1F1B baseline for large language models.

Submitted 10 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

arXiv:2403.18884 [pdf, other]

doi 10.1364/OPTICA.524812

Nanoscale Dark-Field Imaging in Full-Field Transmission X-Ray Microscopy

Authors: Sami Wirtensohn, Peng Qi, Christan David, Julia Herzen, Imke Greving, Silja Flenner

Abstract: The dark-field signal uncovers details beyond conventional X-ray attenuation contrast, which is especially valuable for material sciences. In particular, dark-field techniques are able to reveal structures beyond the spatial resolution of a setup. However, its implementation is so far limited to the micrometer regime. Therefore, we propose a technique to extend full-field transmission X-ray microscopy by the dark-field signal. The proposed method is based on a well-defined illumination of a beam-shaping condenser, which allows the bright field to be blocked by motorized apertures in the back focal plane of the objective lens. This method offers a simple implementation and enables rapid modality changes while maintaining short scan times, making dark-field imaging widely available at the nanometer scale.

Submitted 3 May, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: Updated version; revised according to the comments from the peer review. There were only minor changes. In general we mainly added more detailed explanations. We changed the title, extended sections 2.1., 2.2., 2.3. and 3., extended Fig. 1 to better illustrate the differences between the two imaging modes, and made some slight adjustments in section 4. for clarifications

arXiv:2403.05012 [pdf]

Ultrafast Dynamics of Bilayer and Trilayer Nickelate Superconductors

Authors: Y. D. Li, Y. T. Cao, L. Y. Liu, P. Peng, H. Lin, C. Y. Pei, M. X. Zhang, H. Wu, X. Du, W. X. Zhao, K. Y. Zhai, J. K. Zhao, M. -L. Lin, P. H. Tan, Y. P. Qi, G. Li, H. J. Guo, Luyi Yang, L. X. Yang

Abstract: In addition to the pressurized high-temperature superconductivity, bilayer and trilayer nickelate superconductors La$_{n+1}$Ni$_n$O$_{3n+1}$ (n = 2 and 3) exhibit many intriguing properties at ambient pressure, such as orbital-dependent electronic correlation, non-Fermi liquid behavior, and density-wave transitions. Here, using ultrafast reflectivity measurement, we observe a drastic difference between the ultrafast dynamics of the bilayer and trilayer nickelates at ambient pressure. Firstly, we observe a coherent phonon mode in La4Ni3O10 involving the collective vibration of La, Ni, and O atoms, which is absent in La3Ni2O7. Secondly, the temperature-dependent relaxation time diverges near the density-wave transition temperature of La4Ni3O10, in drastic contrast to kink-like changes in La3Ni2O7. Moreover, we estimate the electron-phonon coupling constants to be 0.05~0.07 and 0.12~0.16 for La3Ni2O7 and La4Ni3O10, respectively, suggesting a relatively minor role of electron-phonon coupling in the electronic properties of La$_{n+1}$Ni$_n$O$_{3n+1}$. Our work not only sheds light on the relevant microscopic interaction but also establishes a foundation for further studying the interplay between superconductivity and density-wave transitions in nickelate superconductors.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.03170 [pdf, other]

SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection

Authors: Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee

Abstract: Misinformation is a prevalent societal issue due to its potential high risks. Out-of-context (OOC) misinformation, where authentic images are repurposed with false text, is one of the easiest and most effective ways to mislead audiences. Current methods focus on assessing image-text consistency but lack convincing explanations for their judgments, which is essential for debunking misinformation. While Multimodal Large Language Models (MLLMs) have rich knowledge and innate capability for visual reasoning and explanation generation, they still lack sophistication in understanding and discovering the subtle crossmodal differences. In this paper, we introduce SNIFFER, a novel multimodal large language model specifically engineered for OOC misinformation detection and explanation. SNIFFER employs two-stage instruction tuning on InstructBLIP. The first stage refines the model's concept alignment of generic objects with news-domain entities and the second stage leverages language-only GPT-4 generated OOC-specific instruction data to fine-tune the model's discriminatory powers. Enhanced by external tools and retrieval, SNIFFER not only detects inconsistencies between text and image but also utilizes external knowledge for contextual verification. Our experiments show that SNIFFER surpasses the original MLLM by over 40% and outperforms state-of-the-art methods in detection accuracy. SNIFFER also provides accurate and persuasive explanations as validated by quantitative and human evaluations.

Submitted 5 March, 2024; originally announced March 2024.

Comments: To appear in CVPR 2024

arXiv:2403.01203 [pdf, other]

Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment

Authors: Luyao Wang, Pengnian Qi, Xigang Bao, Chunlai Zhou, Biao Qin

Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs for integration. Unfortunately, prior arts have attempted to improve the interaction and fusion of multi-modal information, which have overlooked the influence of modal-specific noise and the usage of labeled and unlabeled data in semi-supervised settings. In this work, we introduce a Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA) in a semi-supervised way. Specifically, in order to generate holistic entity representations, we first devise various embedding modules and attention mechanisms to extract visual, structural, relational, and attribute features. Different from the prior direct fusion methods, we next propose to exploit mutual information maximization to filter the modal-specific noise and to augment modal-invariant commonality. Then, we combine pseudo-label calibration with momentum-based contrastive learning to make full use of the labeled and unlabeled data, which improves the quality of pseudo-label and pulls aligned entities closer. Finally, extensive experiments on two MMEA datasets demonstrate the effectiveness of our PCMEA, which yields state-of-the-art performance.

Submitted 2 March, 2024; originally announced March 2024.

Comments: accepted by AAAI2024

arXiv:2402.06028 [pdf, ps, other]

Iwasawa $λ$ invariant and Massey product

Authors: Peikai Qi

Abstract: We compute Iwasawa $λ$ invariant in terms of Massey products in Galois cohomology with restricted ramification. When applied to imaginary quadratic fields and cyclotomic fields, we obtain a new proof and generalization of results of Gold and McCallum-Sharifi. The main tool is the generalized Bockstein map introduced by Lam-Liu-Sharifi-Wake-Wang.

Submitted 8 February, 2024; originally announced February 2024.

Comments: 32 pages

arXiv:2401.10241 [pdf, other]

Zero Bubble Pipeline Parallelism

Authors: Penghui Qi, Xinyi Wan, Guangxing Huang, Min Lin

Abstract: Pipeline parallelism is one of the key components for large-scale distributed training, yet its efficiency suffers from pipeline bubbles which were deemed inevitable. In this work, we introduce a scheduling strategy that, to our knowledge, is the first to successfully achieve zero pipeline bubbles under synchronous training semantics. The key idea behind this improvement is to split the backward computation into two parts, one that computes the gradient for the input and another that computes the gradient for the parameters. Based on this idea, we handcraft novel pipeline schedules that significantly outperform the baseline methods. We further develop an algorithm that automatically finds an optimal schedule based on specific model configuration and memory limit. Additionally, to truly achieve zero bubble, we introduce a novel technique to bypass synchronizations during the optimizer step. Experimental evaluations show that our method outperforms the 1F1B schedule by up to 23% in throughput under a similar memory limit. This number can be further pushed to 31% when the memory constraint is relaxed. We believe our results mark a major step forward in harnessing the true potential of pipeline parallelism. We open sourced our implementation based on the popular Megatron-LM repository at https://github.com/sail-sg/zero-bubble-pipeline-parallelism.

Submitted 30 November, 2023; originally announced January 2024.
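The split of the backward computation described in the abstract above can be illustrated for a single linear layer y = xW. This is only a minimal sketch under that assumption, not the authors' implementation: the function names `backward_input` and `backward_weight` are hypothetical, and the tiny pure-Python matrix helpers stand in for a real tensor library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def backward_input(grad_out, weight):
    """Gradient w.r.t. the layer input: dL/dx = dL/dy @ W^T.

    The previous pipeline stage needs this result to continue its own backward pass.
    """
    return matmul(grad_out, transpose(weight))


def backward_weight(x, grad_out):
    """Gradient w.r.t. the parameters: dL/dW = x^T @ dL/dy.

    No other stage depends on this result, so it can be computed later.
    """
    return matmul(transpose(x), grad_out)


# Hypothetical toy values: batch of 1, 2 input features, 3 output features.
x = [[1.0, 2.0]]
weight = [[1.0, 0.0, 2.0],
          [0.0, 1.0, 3.0]]
grad_out = [[1.0, 1.0, 1.0]]

grad_x = backward_input(grad_out, weight)   # [[3.0, 4.0]]
grad_w = backward_weight(x, grad_out)       # [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
```

Because the two functions share no intermediate result, a scheduler is free to run the input-gradient part immediately and defer the parameter-gradient part, which is the degree of freedom the abstract refers to.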

arXiv:2309.12247 [pdf, other]

doi 10.1609/aaai.v38i20.30214

Bad Actor, Good Advisor: Exploring the Role of Large Language Models in Fake News Detection

Authors: Beizhe Hu, Qiang Sheng, Juan Cao, Yuhui Shi, Yang Li, Danding Wang, Peng Qi

Abstract: Detecting fake news requires both a delicate sense of diverse clues and a profound understanding of the real-world background, which remains challenging for detectors based on small language models (SLMs) due to their knowledge and capability limitations. Recent advances in large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with fake news detection remains underexplored. In this paper, we investigate the potential of LLMs in fake news detection. First, we conduct an empirical study and find that a sophisticated LLM such as GPT 3.5 could generally expose fake news and provide desirable multi-perspective rationales but still underperforms the basic SLM, fine-tuned BERT. Our subsequent analysis attributes such a gap to the LLM's inability to select and integrate rationales properly to conclude. Based on these findings, we propose that current LLMs may not substitute fine-tuned SLMs in fake news detection but can be a good advisor for SLMs by providing multi-perspective instructive rationales. To instantiate this proposal, we design an adaptive rationale guidance network for fake news detection (ARG), in which SLMs selectively acquire insights on news analysis from the LLMs' rationales. We further derive a rationale-free version of ARG by distillation, namely ARG-D, which services cost-sensitive scenarios without querying LLMs. Experiments on two real-world datasets demonstrate that ARG and ARG-D outperform three types of baseline methods, including SLM-based, LLM-based, and combinations of small and large language models.

Submitted 22 January, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: 16 pages, 5 figures, and 9 tables. To appear at AAAI 2024

Journal ref: AAAI 2024

arXiv:2309.08123 [pdf, ps, other]

Multivariate Fibonacci-like Polynomials and their Applications

Authors: Sejin Park, Etienne Phillips, Peikai Qi, Ilir Ziba, Zhan Zhan

Abstract: The Fibonacci polynomials are defined recursively as $f_{n}(x)=xf_{n-1}(x)+f_{n-2}(x)$, where $f_0(x) = 0$ and $f_1(x)= 1$. We generalize these polynomials to an arbitrary number of variables with the $r$-Fibonacci polynomial. We extend several well-known results such as the explicit Binet formula and a Cassini-like identity, and use these to prove that the $r$-Fibonacci polynomials are irreducible over $\mathbb{C}$ for $n \geq r \geq 3$. Additionally, we derive an explicit sum formula and a generalized generating function. Using these results, we establish connections to ordinary Bell polynomials, exponential Bell polynomials, Fubini numbers, and integer and set partitions.

Submitted 14 September, 2023; originally announced September 2023.

Comments: 15 pages. Written and edited by Sejin Park, Etienne Phillips, Ilir Ziba, and Zhan Zhan

MSC Class: 11B39
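The single-variable recursion quoted in the abstract above, $f_{n}(x)=xf_{n-1}(x)+f_{n-2}(x)$ with $f_0(x)=0$ and $f_1(x)=1$, can be evaluated directly. The sketch below covers only this classical case, not the paper's $r$-variable generalization; the function name `fibonacci_polynomial` and the coefficient-list representation are illustrative choices.

```python
def fibonacci_polynomial(n):
    """Return the coefficients of f_n(x), lowest degree first.

    Uses f_n(x) = x * f_{n-1}(x) + f_{n-2}(x), with f_0 = 0 and f_1 = 1.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    prev, curr = [0], [1]  # f_0 and f_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + curr  # multiplying by x shifts every coefficient up one degree
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, curr = curr, [a + b for a, b in zip(shifted, padded)]
    return curr


f4 = fibonacci_polynomial(4)  # [0, 2, 0, 1], i.e. x^3 + 2x
f5 = fibonacci_polynomial(5)  # [1, 0, 3, 0, 1], i.e. x^4 + 3x^2 + 1
```

Evaluating at x = 1 amounts to summing the coefficients, which recovers the ordinary Fibonacci numbers and gives a quick sanity check on the recursion.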

arXiv:2307.04965 [pdf]

Acoustic diagnostics of femtosecond laser filamentation

Authors: Binpeng Shang, Nan Zhang, Pengfei Qi, Shishi Tao, Lie Lin, Weiwei Liu

Abstract: The promising application of femtosecond laser filamentation in atmospheric remote sensing brings imperative demand for diagnosing the spatiotemporal dynamics of filamentation. Acoustic emission (AE) during filamentation opens a door to give the insight into the dynamic evolution of filaments in air. In particular, the frequency features of the acoustic emission provide relevant information on the conversion of laser energy to acoustic energy. Here, the acoustic emission of femtosecond laser filament manipulated by energy and the focal lengths was measured quantitatively by a broadband microphone, and the acoustic parameters were compared and analyzed. Our results showed that the acoustic power presents a squared dependence on the laser energy and the bandwidth of the acoustic spectrum showed a significant positive correlation with laser energy deposition. It was found that the spectrum of the acoustic pulse emitted from the middle of the filament has a larger bandwidth compared to those emitted from the ends of the filament and the spectrum of the acoustic pulse is also an indicator of the filament intensity distribution. These findings are helpful for studying the plasma filament properties and complex dynamic processes through acoustic parameters and allow the optimization of remote applications.

Submitted 10 July, 2023; originally announced July 2023.

Comments: 8 pages, 5 figures

MSC Class: 78A60 ACM Class: J.2.9

arXiv:2307.00742 [pdf, other]

Nanotips for 0.5THz scattering scanning near field microscopy

Authors: Zeliang Zhang, Pengfei Qi, Olga Kosavera, Cheng Gong, Lie Lin, Weiwei Liu

Abstract: This manuscript demonstrates the theory, design, and simulation of scattering scanning near field microscopy (s-SNOM) in 0.5THz. A comprehensive simulation model of nanotips' geometry, sample materials, and incident field is established to significantly improve the scattering efficiency and spatial resolution to achieve optimal performance. The theoretical model is based on full-wave simulation and dipole moment analysis which can describe the overall nanotip's geometry information to screen the optimal parameters corresponding to the 0.5THz, which is the center frequency of most THz sources. The customized nanotip can achieve 40 nm $(λ/ 15000)$ spatial resolution while maintaining an excellent scattering efficiency. This nanotip design method doesn't depend on the homogenization commercial AFM tips, providing an approach for customized nanotip design of THz wave scattering near field microscopy.

Submitted 3 July, 2023; originally announced July 2023.

Comments: 11 pages, 8 figures
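The resolution figure quoted in the abstract above, 40 nm stated as λ/15000 at 0.5 THz, can be checked with a short calculation. This sketch assumes the free-space wavelength λ = c/f; the variable and function names are illustrative only.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact value of c in vacuum


def free_space_wavelength_m(frequency_hz):
    """Free-space wavelength in metres for a given frequency in hertz."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz


wavelength_m = free_space_wavelength_m(0.5e12)  # about 6.0e-4 m, i.e. roughly 600 micrometres
resolution_m = wavelength_m / 15000             # about 4.0e-8 m, i.e. roughly 40 nm
```

The two figures in the abstract are therefore consistent with each other: a 0.5 THz wave is about 0.6 mm long in free space, and one part in 15000 of that is about 40 nm.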

arXiv:2306.15150 [pdf]

Femtosecond Laser Filamentation in Atmospheric Turbulence

Authors: Jiewei Guo, Lu Sun, Yuezheng Wang, Jiayun Xue, Zhi Zhang, Haiyi Liu, Shishi Tao, Pengfei Qi, Lie Lin, Weiwei Liu

Abstract: The effects of turbulence intensity and turbulence region on the distribution of femtosecond laser filaments are experimentally elaborated. Through the ultrasonic signals emitted by the filaments, it is observed that increasing turbulence intensity and expanding turbulence active region cause an increase in the start position of the filament, and a decrease in filament length, which can be well explained by the theoretical calculation. It is also observed that the random perturbation of the air refractive index caused by atmospheric turbulence expanded the spot size of the filament. Additionally, when turbulence intensity reaches , multiple filaments are formed. Furthermore, the standard deviation of the transverse displacement of filament is found to be proportional to the square root of turbulent structure constant under the experimental turbulence parameters in this paper. These results contribute to the study of femtosecond laser propagation mechanisms in complex atmospheric turbulence conditions.

Submitted 26 June, 2023; originally announced June 2023.

Comments: 9 pages, 4 figures

arXiv:2306.12904 [pdf]

Coupled air lasing gain and Mie scattering loss: aerosol effect in filament-induced plasma spectroscopy

Authors: Jiayun Xue, Zhi Zhang, Yuezheng Wang, Binpeng Shang, Jiewei Guo, Shishi Tao, Nan Zhang, Lanjunguo, Pengfei Qi, Lie Lin, Weiwei Liu

Abstract: Femtosecond laser filament-induced plasma spectroscopy (FIPS) demonstrates great potential in remote sensing for identifying atmospheric pollutant molecules. Due to the widespread aerosols in the atmosphere, remote detection based on FIPS would be affected through both the excitation and the propagation of fingerprint fluorescence, effects which still remain elusive. Here the physical model of filament-induced aerosol fluorescence is established to reveal the combined effect of Mie scattering and amplified spontaneous emission, which is then proved by the experimental results, namely the dependence of the backward fluorescence on the interaction length between filament and aerosols. These findings provide an insight into the complicated aerosol effect in the overall physical process of FIPS including propagation, excitation and emission, paving the way to its practical application in atmospheric remote sensing.

Submitted 22 June, 2023; originally announced June 2023.

Comments: 7 pages, 4 figures

arXiv:2306.09355 [pdf]

Remote Sensing of Trace Element in Sea Salt Aerosol with Sensitivity Level of 10 pg/m3

Authors: Yuezheng Wang, Nan Zhang, Jiayun Xue, Bingpeng Shang, Jiewei Guo, Zhi Zhang, Pengfei Qi, Lie Lin, Weiwei Liu

Abstract: Sea salt aerosols composed mainly of micrometer-sized sodium chloride particles not only pose a potential threat to human health and traffic safety, but also directly affect climate prediction. The long-range and high-precision sensing of sea salt aerosols remains a challenge for existing composition analysis methods. With the development of ultrashort laser technology, femtosecond laser filamentation provides a new opportunity for molecular remote sensing in complex environments. However, the accuracy of this method at long distance still struggles to meet the demand (<10 ng/m3) for remote aerosol monitoring. To solve this problem, we built a remote detection system for sea salt aerosol fluorescence spectroscopy and obtained a very high system sensitivity by introducing a terawatt-class high-performance femtosecond laser and optimizing the filament and aerosol interaction length. The system achieves a Na+ detection limit of 0.015 ng/m3 at a detection distance of 30 m, and 0.006 ng/m3 when supplemented with a deep processing learning algorithm.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 13 pages, 3 figures

arXiv:2306.07281 [pdf]

Filament based Ionizing Radiation Sensing Technology

Authors: Weiwei Liu, Jiewei Guo, Nan Zhang, Lu Sun, Haiyi Liu, Shihi Tao, Yuezheng Wang, Binpeng Shang, Pengfei Qi, Lie Lin

Abstract: Accidental exposure to an overdose of ionizing radiation will inevitably lead to severe biological damage, thus detecting and localizing radiation is essential. Traditional measurement techniques are generally restricted to a limited detection range of a few centimeters, posing a great risk to operators. Its potential in remote sensing makes femtosecond laser filament technology a great candidate for constructively addressing this challenge. Here we propose a novel filament-based ionizing radiation sensing technology (FIRST), and clarify the interaction mechanism between filaments and ionizing radiation. Specifically, it is demonstrated that the energetic electrons and ions produced by α radiation in air can be effectively accelerated within the filament, serving as seed electrons, which will markedly enhance nitrogen fluorescence. The extended nitrogen fluorescence lifetime of ~1 ns is also observed. These findings provide insights into the intricate interaction among ultra-strong light field, plasma and energetic particle beam, and pave the way for the remote sensing of ionizing radiation.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 13 pages, 6 figures

arXiv:2306.05241 [pdf, other]

Two Heads Are Better Than One: Improving Fake News Video Detection by Correlating with Neighbors

Authors: Peng Qi, Yuyang Zhao, Yufeng Shen, Wei Ji, Juan Cao, Tat-Seng Chua

Abstract: The prevalence of short video platforms has spawned a lot of fake news videos, which have stronger propagation ability than textual fake news. Thus, automatically detecting fake news videos has been an important countermeasure in practice. Previous works commonly verify each news video individually with multimodal information. Nevertheless, news videos from different perspectives regarding the same event are commonly posted together, which contain complementary or contradictory information and thus can be used to evaluate each other mutually. To this end, we introduce a new and practical paradigm, i.e., cross-sample fake news video detection, and propose a novel framework, Neighbor-Enhanced fakE news video Detection (NEED), which integrates the neighborhood relationship of news videos belonging to the same event. NEED can be readily combined with existing single-sample detectors and further enhance their performances with the proposed graph aggregation (GA) and debunking rectification (DR) modules. Specifically, given the feature representations obtained from single-sample detectors, GA aggregates the neighborhood information with the dynamic graph to enrich the features of independent samples. After that, DR explicitly leverages the relationship between debunking videos and fake news videos to refute the candidate videos via textual and visual consistency. Extensive experiments on the public benchmark demonstrate that NEED greatly improves the performance of both single-modal (up to 8.34% in accuracy) and multimodal (up to 4.97% in accuracy) base detectors. Code is available at https://github.com/ICTMCG/NEED.

Submitted 8 June, 2023; originally announced June 2023.

Comments: To appear in ACL 2023 Findings

arXiv:2303.03093 [pdf, other]

A Miniaturised Camera-based Multi-Modal Tactile Sensor

Authors: Kaspar Althoefer, Yonggen Ling, Wanlin Li, Xinyuan Qian, Wang Wei Lee, Peng Qi

Abstract: In conjunction with huge recent progress in camera and computer vision technology, camera-based sensors have increasingly shown considerable promise in relation to tactile sensing. In comparison to competing technologies (be they resistive, capacitive or magnetic based), they offer super-high-resolution, while suffering from fewer wiring problems. The human tactile system is composed of various types of mechanoreceptors, each able to perceive and process distinct information such as force, pressure, texture, etc. Camera-based tactile sensors such as GelSight mainly focus on high-resolution geometric sensing on a flat surface, and their force measurement capabilities are limited by the hysteresis and non-linearity of the silicone material. In this paper, we present a miniaturised dome-shaped camera-based tactile sensor that allows accurate force and tactile sensing in a single coherent system. The key novelty of the sensor design is as follows. First, we demonstrate how to build a smooth silicone hemispheric sensing medium with uniform markers on its curved surface. Second, we enhance the illumination of the rounded silicone with diffused LEDs. Third, we construct a force-sensitive mechanical structure in a compact form factor with usage of springs to accurately perceive forces. Our multi-modal sensor is able to acquire tactile information from multi-axis forces, local force distribution, and contact geometry, all in real-time. We apply an end-to-end deep learning method to process all the information.

Submitted 6 March, 2023; originally announced March 2023.

arXiv:2302.03242 [pdf, other]

doi 10.1145/3581783.3612426

Combating Online Misinformation Videos: Characterization, Detection, and Future Directions

Authors: Yuyan Bu, Qiang Sheng, Juan Cao, Peng Qi, Danding Wang, Jintao Li

Abstract: With information consumption via online video streaming becoming increasingly popular, misinformation video poses a new threat to the health of the online information ecosystem. Though previous studies have made much progress in detecting misinformation in text and image formats, video-based misinformation brings new and unique challenges to automatic detection systems: 1) high information heterogeneity brought by various modalities, 2) blurred distinction between misleading video manipulation and nonmalicious artistic video editing, and 3) new patterns of misinformation propagation due to the dominant role of recommendation systems on online video platforms. To facilitate research on this challenging task, we conduct this survey to present advances in misinformation video detection. We first analyze and characterize the misinformation video from three levels including signals, semantics, and intents. Based on the characterization, we systematically review existing works for detection from features of various modalities to techniques for clue integration. We also introduce existing resources including representative datasets and useful tools. Besides summarizing existing studies, we discuss related areas and outline open issues and future directions to encourage and guide more research on misinformation video detection. The corresponding repository is at https://github.com/ICTMCG/Awesome-Misinfo-Video-Detection.

Submitted 6 August, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

Comments: Accepted at ACM Multimedia 2023 (MM 2023). 11 pages, 4 figures, and 89 references

arXiv:2212.11352 [pdf]

Sensitivity analysis of biological washout and depth selection for a machine learning based dose verification framework in proton therapy

Authors: Shixiong Yu, Yuxiang Liu, Zongsheng Hu, Haozhao Zhang, Pengyu Qi, Hao Peng

Abstract: Dose verification based on proton-induced positron emitters is a promising quality assurance tool and may leverage the strength of artificial intelligence. To move a step closer towards practical application, the sensitivity analysis of two factors needs to be performed: biological washout and depth selection. A bi-directional recurrent neural network (RNN) model was developed. The training dataset was generated based upon a CT image-based phantom (abdomen region) and multiple beam energies/pathways, using Monte-Carlo simulation (1 mm spatial resolution, no biological washout). For the modeling of biological washout, a simplified analytical model was applied to change raw activity profiles over a period of 5 minutes, incorporating both physical decay and biological washout. For the study of depth selection (a challenge linked to multi field/angle irradiation), truncations were applied at different window lengths (100, 125, 150 mm) to raw activity profiles. Finally, the performance of a worst-case scenario was examined by combining both factors (depth selection: 125 mm, biological washout: 5 mins). The accuracy was quantitatively evaluated in terms of range uncertainty, mean absolute error (MAE) and mean relative errors (MRE). Our proposed AI framework shows good immunity to the perturbation associated with two factors. The detection of proton-induced positron emitters, combined with machine learning, has great potential to implement online patient-specific verification in proton therapy.

Submitted 21 December, 2022; originally announced December 2022.

arXiv:2212.09912 [pdf, other]

Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks

Authors: Kaiser Sun, Peng Qi, Yuhao Zhang, Lan Liu, William Yang Wang, Zhiheng Huang

Abstract: Generative models have been widely applied to solve extractive tasks, where parts of the input are extracted to form the desired output, and achieved significant success. For example, in extractive question answering (QA), generative models have constantly yielded state-of-the-art results. In this work, we identify the issue of tokenization inconsistency that is commonly neglected in training these models. This issue damages the extractive nature of these tasks after the input and output are tokenized inconsistently by the tokenizer, and thus leads to performance drop as well as hallucination. We propose a simple yet effective fix to this issue and conduct a case study on extractive QA. We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets, with a notable average of +1.7 F2 gain when a BART model is trained on SQuAD and evaluated on 8 QA datasets. Further, the model converges faster, and becomes less likely to generate out-of-context answers. With these findings, we would like to call for more attention on how tokenization should be done when solving extractive tasks and recommend applying consistent tokenization during training.

Submitted 24 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: Findings of EMNLP2023

arXiv:2211.11621 [pdf, other]

doi 10.1364/OE.481678

Tilting refractive x-ray lenses for fine-tuning their focal length

Authors: Rafael Celestre, Thomas Roth, Carsten Detlefs, Peng Qi, Marco Cammarata, Manuel Sanchez del Rio, Raymond Barrett

Abstract: In this work, we measure and model tilted x-ray refractive lenses to investigate their effects on an x-ray beam. The modelling is benchmarked against at-wavelength metrology obtained with x-ray speckle vector tracking experiments (XSVT) at the BM05 beamline at the ESRF-EBS light source, showing very good agreement. This validation permits us to explore possible applications of tilted x-ray lenses in optical design: we demonstrate that tilting 1D lenses around their focusing direction can be used for fine-tuning their focal length with possible applications in beamline optical design.

Submitted 21 November, 2022; originally announced November 2022.

Comments: 15 pages, 13 figures, 38 references to be submitted to Optics Express

arXiv:2211.10973 [pdf, other]

FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms

Authors: Peng Qi, Yuyan Bu, Juan Cao, Wei Ji, Ruihao Shui, Junbin Xiao, Danding Wang, Tat-Seng Chua

Abstract: Short video platforms have become an important channel for news sharing, but also a new breeding ground for fake news. To mitigate this problem, research of fake news video detection has recently received a lot of attention. Existing works face two roadblocks: the scarcity of comprehensive and large-scale datasets and insufficient utilization of multimodal information. Therefore, in this paper, we construct the largest Chinese short video dataset about fake news named FakeSV, which includes news content, user comments, and publisher profiles simultaneously. To understand the characteristics of fake news videos, we conduct exploratory analysis of FakeSV from different perspectives. Moreover, we provide a new multimodal detection model named SV-FEND, which exploits the cross-modal correlations to select the most informative features and utilizes the social context information for detection. Extensive experiments evaluate the superiority of the proposed method and provide detailed comparisons of different methods and modalities for future works.

Submitted 2 December, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

Comments: To appear in AAAI 2023 AISI track. This version contains appendix with additional details

arXiv:2210.07126 [pdf, other]

Challenges in Explanation Quality Evaluation

Authors: Hendrik Schuff, Heike Adel, Peng Qi, Ngoc Thang Vu

Abstract: While much research focused on producing explanations, it is still unclear how the produced explanations' quality can be evaluated in a meaningful way. Today's predominant approach is to quantify explanations using proxy scores which compare explanations to (human-annotated) gold explanations. This approach assumes that explanations which reach higher proxy scores will also provide a greater benefit to human users. In this paper, we present problems of this approach. Concretely, we (i) formulate desired characteristics of explanation quality, (ii) describe how current evaluation practices violate them, and (iii) support our argumentation with initial evidence from a crowdsourcing case study in which we investigate the explanation quality of state-of-the-art explainable question answering systems. We find that proxy scores correlate poorly with human quality ratings and, additionally, become less expressive the more often they are used (i.e. following Goodhart's law). Finally, we propose guidelines to enable a meaningful evaluation of explanations to drive the development of systems that provide tangible benefits to human users.

Submitted 9 March, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: 41 pages, 11 figures

arXiv:2210.06633 [pdf, other]

Language Agnostic Multilingual Information Retrieval with Contrastive Learning

Authors: Xiyang Hu, Xinchi Chen, Peng Qi, Deguang Kong, Kunlun Liu, William Yang Wang, Zhiheng Huang

Abstract: Multilingual information retrieval (IR) is challenging since annotated training data is costly to obtain in many languages. We present an effective method to train multilingual IR systems when only English IR training data and some parallel corpora between English and other languages are available. We leverage parallel and non-parallel corpora to improve the pretrained multilingual language models' cross-lingual transfer ability. We design a semantic contrastive loss to align representations of parallel sentences that share the same semantics in different languages, and a new language contrastive loss to leverage parallel sentence pairs to remove language-specific information in sentence representations from non-parallel corpora. When trained on English IR data with these losses and evaluated zero-shot on non-English data, our model demonstrates significant improvement to prior work on retrieval performance, while it requires much less computational effort. We also demonstrate the value of our model for a practical setting when a parallel corpus is only available for a few languages, but a lack of parallel corpora resources persists for many other low-resource languages. Our model can work well even with a small number of parallel sentences, and be used as an add-on module to any backbones and other tasks.

Submitted 25 May, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: ACL Findings 2023

arXiv:2208.02169 [pdf, other]

SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences

Authors: Peng Qi, Guangtao Wang, Jing Huang

Abstract: Distilling supervision signal from a long sequence to make predictions is a challenging task in machine learning, especially when not all elements in the input sequence contribute equally to the desired output. In this paper, we propose SpanDrop, a simple and effective data augmentation technique that helps models identify the true supervision signal in a long sequence with very few examples. By directly manipulating the input sequence, SpanDrop randomly ablates parts of the sequence at a time and asks the model to perform the same task to emulate counterfactual learning and achieve input attribution. Based on theoretical analysis of its properties, we also propose a variant of SpanDrop based on the beta-Bernoulli distribution, which yields diverse augmented sequences while providing a learning objective that is more consistent with the original dataset. We demonstrate the effectiveness of SpanDrop on a set of carefully designed toy tasks, as well as various natural language processing tasks that require reasoning over long sequences to arrive at the correct answer, and show that it helps models improve performance both when data is scarce and abundant.

Submitted 3 August, 2022; originally announced August 2022.

Comments: Peng Qi and Guangtao Wang contributed equally

arXiv:2207.12021 [pdf, other]

Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent

Authors: Ethan A. Chi, Ashwin Paranjape, Abigail See, Caleb Chiam, Trenton Chang, Kathleen Kenealy, Swee Kiat Lim, Amelia Hardy, Chetanya Rastogi, Haojun Li, Alexander Iyabor, Yutong He, Hari Sowrirajan, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Jillian Tang, Avanika Narayan, Giovanni Campagna, Christopher D. Manning

Abstract: We present Chirpy Cardinal, an open-domain social chatbot. Aiming to be both informative and conversational, our bot chats with users in an authentic, emotionally intelligent way. By integrating controlled neural generation with scaffolded, hand-written dialogue, we let both the user and bot take turns driving the conversation, producing an engaging and socially fluent experience. Deployed in the fourth iteration of the Alexa Prize Socialbot Grand Challenge, Chirpy Cardinal handled thousands of conversations per day, placing second out of nine bots with an average user rating of 3.58/5.

Submitted 16 January, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

Comments: SIGDIAL '22

arXiv:2206.13030 [pdf]

doi 10.1103/PhysRevB.105.245106

Hybridization and Correlation between f- and d-orbital electrons in a valence fluctuating compound EuNi2P2

Authors: Z. X. Yin, X. Du, W. Z. Cao, J. Jiang, C. Chen, S. R. Duan, J. S. Zhou, X. Gu, R. Z. Xu, Q. Q. Zhang, W. X. Zhao, Y. D. Li, Yi-feng Yang, H. F. Yang, A. J. Liang, Z. K. Liu, H. Yao, Y. P. Qi, Y. L. Chen, L. X. Yang

Abstract: The interaction between localized f and itinerant conduction electrons is crucial in the electronic properties of heavy fermion and valence fluctuating compounds. Using high-resolution angle-resolved photoemission spectroscopy, we systematically investigate the electronic structure of the archetypical valence fluctuating compound EuNi2P2 that hosts multiple f electrons. At low temperatures, we reveal the hybridization between Eu 4f and Ni 3d states, which contributes to the electron mass enhancement, consistent with the periodic Anderson model. With increasing temperature, interestingly, we observe opposite temperature evolution of electron spectral function above and below the Kondo coherence temperature near 110 K, which is in contrast to the monotonic valence change and beyond the expectation of the periodic Anderson model. We argue that both f-d hybridization and correlation are imperative in the electronic properties of EuNi2P2. Our results shed light on the understanding of novel properties, such as heavy fermion behaviors and valence fluctuation, of rare-earth transition-metal intermetallic compounds with multiple f electrons.

Submitted 26 June, 2022; originally announced June 2022.

Journal ref: Phys. Rev. B 105, 245106 (2022)

arXiv:2203.10542 [pdf]

doi 10.1103/PhysRevB.105.155108

Robust Kagome Electronic Structure in Topological Quantum Magnets XMn6Sn6 (X = Dy, Tb, Gd, Y)

Authors: X. Gu, C. Chen, W. S. Wei, J. Y. Liu, X. Du, D. Pei, J. S. Zhou, R. Z. Xu, Z. X. Yin, W. X. Zhao, Y. D. Li, C. Jozwiak, A. Bostwick, E. Rotenberg, D. Backes, L. S. I. Veiga, S. Dhesi, T. Hesjedal, G. van der Laan, H. F. Du, W. J. Jiang, Y. P. Qi, G. Li, W. J. Shi, Z. K. Liu, et al. (2 additional authors not shown)

Abstract: Crystal geometry can greatly influence the emergent properties of quantum materials. As an example, the kagome lattice is an ideal platform to study the rich interplay between topology, magnetism, and electronic correlation. In this work, combining high-resolution angle-resolved photoemission spectroscopy and ab-initio calculation, we systematically investigate the electronic structure of XMn6Sn6 (X = Dy, Tb, Gd, Y) family compounds. We observe the Dirac fermion and the flat band arising from the magnetic kagome lattice of Mn atoms. Interestingly, the flat band locates in the same energy region in all compounds studied, regardless of their different magnetic ground states and 4f electronic configurations. These observations suggest a robust Mn magnetic kagome lattice across the XMn6Sn6 family, thus providing an ideal platform for the search and investigation on new emergent phenomena in magnetic topological materials.

Submitted 20 March, 2022; originally announced March 2022.

Comments: PRB accepted

arXiv:2203.09121 [pdf, other]

DRAG: Dynamic Region-Aware GCN for Privacy-Leaking Image Detection

Authors: Guang Yang, Juan Cao, Qiang Sheng, Peng Qi, Xirong Li, Jintao Li

Abstract: The daily practice of sharing images on social media raises a severe issue about privacy leakage. To address the issue, privacy-leaking image detection is studied recently, with the goal to automatically identify images that may leak privacy. Recent advance on this task benefits from focusing on crucial objects via pretrained object detectors and modeling their correlation. However, these methods have two limitations: 1) they neglect other important elements like scenes, textures, and objects beyond the capacity of pretrained object detectors; 2) the correlation among objects is fixed, but a fixed correlation is not appropriate for all the images. To overcome the limitations, we propose the Dynamic Region-Aware Graph Convolutional Network (DRAG) that dynamically finds out crucial regions including objects and other important elements, and models their correlation adaptively for each input image. To find out crucial regions, we cluster spatially-correlated feature channels into several region-aware feature maps. Further, we dynamically model the correlation with the self-attention mechanism and explore the interaction among the regions with a graph convolutional network. The DRAG achieved an accuracy of 87% on the largest dataset for privacy-leaking image detection, which is 10 percentage points higher than the state of the art. The further case study demonstrates that it found out crucial regions containing not only objects but other important elements like textures.

Submitted 17 March, 2022; originally announced March 2022.

Comments: Accepted to AAAI-22, 9 pages

arXiv:2203.00255 [pdf, other]

Improving Time Sensitivity for Question Answering over Temporal Knowledge Graphs

Authors: Chao Shang, Guangtao Wang, Peng Qi, Jing Huang

Abstract: Question answering over temporal knowledge graphs (KGs) efficiently uses facts contained in a temporal KG, which records entity relations and when they occur in time, to answer natural language questions (e.g., "Who was the president of the US before Obama?"). These questions often involve three time-related challenges that previous work fails to adequately address: 1) questions often do not specify exact timestamps of interest (e.g., "Obama" instead of 2000); 2) subtle lexical differences in time relations (e.g., "before" vs "after"); 3) off-the-shelf temporal KG embeddings that previous work builds on ignore the temporal order of timestamps, which is crucial for answering temporal-order related questions. In this paper, we propose a time-sensitive question answering (TSQA) framework to tackle these problems. TSQA features a timestamp estimation module to infer the unwritten timestamp from the question. We also employ a time-sensitive KG encoder to inject ordering information into the temporal KG embeddings that TSQA is based on. With the help of techniques to reduce the search space for potential answers, TSQA significantly outperforms the previous state of the art on a new benchmark for question answering over temporal KGs, especially achieving a 32% (absolute) error reduction on complex questions that require multiple steps of reasoning over facts in the temporal KG.

Submitted 1 March, 2022; originally announced March 2022.

Comments: 10 pages, 2 figures

Journal ref: ACL 2022

arXiv:2202.02739 [pdf, other]

doi 10.1088/1361-648X/ac577a

$s$-wave superconductivity in the noncentrosymmetric W$_3$Al$_2$C superconductor: An NMR study

Authors: D. Tay, T. Shang, Y. P. Qi, T. P. Ying, H. Hosono, H-R. Ott, T. Shiroka

Abstract: We report on a microscopic study of the noncentrosymmetric superconductor W$_3$Al$_2$C (with $T_c$ = 7.6 K), mostly by means of $^{27}$Al- and $^{13}$C nuclear magnetic resonance (NMR). Since in this material the density of states at the Fermi level is dominated by the tungsten's 5$d$ orbitals, we expect a sizeable spin-orbit coupling (SOC) effect. The normal-state electronic properties of W$_3$Al$_2$C resemble those of a standard metal, but with a Korringa product $1/(T_{1}T)$ significantly smaller than that of metallic Al, reflecting the marginal role played by $s$-electrons. In the superconducting state, we observe a reduction of the Knight shift and an exponential decrease of the NMR relaxation rate $1/T_1$, typical of $s$-wave superconductivity. This is further supported by the observation of a small but distinct coherence peak just below $T_c$ in the $^{13}$C NMR relaxation-rate, in agreement with the fully-gapped superconducting state inferred from the electronic specific-heat data well below $T_c$. The above features are compared to those of members of the same family, in particular, Mo$_3$Al$_2$C, often claimed to exhibit unconventional superconductivity. We discuss why, despite the enhanced SOC, W$_3$Al$_2$C does not show spin-triplet features in its superconducting state and consider the broader consequences of our results for noncentrosymmetric superconductors in general.

Submitted 6 February, 2022; originally announced February 2022.

Comments: 8 pages, 6 figures, accepted by J. Phys.: Condens. Matter

Journal ref: J. Phys.: Condens. Matter 34 194005 (2022)

arXiv:2201.08806 [pdf, other]

doi 10.1103/PhysRevAccelBeams.25.050701

Signatures of misalignment in x-ray cavities of cavity-based x-ray free-electron lasers

Authors: Peng Qi, Yuri Shvyd'ko

Abstract: Cavity-based x-ray free-electron lasers (CBXFEL) will allow use of optical cavity feedback to support generation of fully coherent x-rays of high brilliance and stability by electrons in undulators. CBXFEL optical cavities comprise Bragg-reflecting flat crystal mirrors, which ensure x-rays circulation on a closed orbit, and x-ray refractive lenses, which stabilize the orbit and refocus the x-rays back on the electrons in the undulator. Depending on the cavity design, there are tens of degrees of freedom of the optical elements, which can never be perfectly aligned. Here, we study signatures of misalignment of the optical components and of the undulator source with the purposes of understanding the effects of misalignment on x-ray beam dynamics, understanding misalignment tolerances, and developing cavity alignment procedures. Betatron oscillations of the x-ray beam trajectory (both symmetric and asymmetric) are one of the characteristic signatures of cavity misalignment. The oscillation period is in the general case a non-integer number of round-trip passes of x-rays in the cavity. This period (unlike the amplitude and offset of the oscillations) is independent of the type of misalignment and is defined by cavity parameters. The studies are performed on an example of a four-crystal rectangular cavity using analytical and numerical wave optics as well as ray-tracing techniques. Both confocal and generic stable cavity types are studied.

Submitted 23 March, 2022; v1 submitted 21 January, 2022; originally announced January 2022.

Comments: 20 pages, 24 figures

Journal ref: Phys. Rev. Accel. Beams 25 (2022) 050701

arXiv:2201.03014 [pdf, other]

Glance and Focus Networks for Dynamic Visual Recognition

Authors: Gao Huang, Yulin Wang, Kangchen Lv, Haojun Jiang, Wenhui Huang, Pengfei Qi, Shiji Song

Abstract: Spatial redundancy widely exists in visual recognition tasks, i.e., discriminative features in an image or video frame usually correspond to only a subset of pixels, while the remaining regions are irrelevant to the task at hand. Therefore, static models which process all the pixels with an equal amount of computation result in considerable redundancy in terms of time and space consumption. In this paper, we formulate the image recognition problem as a sequential coarse-to-fine feature learning process, mimicking the human visual system. Specifically, the proposed Glance and Focus Network (GFNet) first extracts a quick global representation of the input image at a low resolution scale, and then strategically attends to a series of salient (small) regions to learn finer features. The sequential process naturally facilitates adaptive inference at test time, as it can be terminated once the model is sufficiently confident about its prediction, avoiding further redundant computation. It is worth noting that the problem of locating discriminant regions in our model is formulated as a reinforcement learning task, thus requiring no additional manual annotations other than classification labels. GFNet is general and flexible as it is compatible with any off-the-shelf backbone models (such as MobileNets, EfficientNets and TSM), which can be conveniently deployed as the feature extractor. Extensive experiments on a variety of image classification and video recognition tasks and with various backbone models demonstrate the remarkable efficiency of our method. For example, it reduces the average latency of the highly efficient MobileNet-V3 on an iPhone XS Max by 1.3x without sacrificing accuracy. Code and pre-trained models are available at https://github.com/blackfeather-wang/GFNet-Pytorch.

Submitted 4 August, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI). Journal version of arXiv:2010.05300 (NeurIPS 2020). The first two authors contributed equally

arXiv:2110.10030 [pdf, other]

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization

Authors: Panjie Qi, Edwin Hsing-Mean Sha, Qingfeng Zhuge, Hongwu Peng, Shaoyi Huang, Zhenglun Kong, Yuhong Song, Bingbing Li

Abstract: State-of-the-art Transformer-based models, with gigantic parameters, are difficult to accommodate on resource-constrained embedded devices. Moreover, with the development of technology, more and more embedded devices are available to run a Transformer model. For a Transformer model with different constraints (tight or loose), it can be deployed onto devices with different computing power. However, in previous work, designers did not choose the best device among multiple devices. Instead, they just used an existing device to deploy the model, which was not necessarily the best fit and may lead to underutilization of resources. To address the deployment challenge of Transformer and the problem of selecting the best device, we propose an algorithm & hardware closed-loop acceleration framework. Given a dataset, a model, latency constraint LC and accuracy constraint AC, our framework can provide a best device satisfying both constraints. In order to generate a compressed model with high sparsity ratio, we propose a novel pruning technique, hierarchical pruning (HP). We optimize the sparse matrix storage format for HP matrix to further reduce memory usage for FPGA implementation. We design an accelerator that takes advantage of HP to solve the problem of concurrent random access. Experiments on Transformer and TinyBert model show that our framework can find different devices for various LC and AC, covering from low-end devices to high-end devices. Our HP can achieve higher sparsity ratio and is more flexible than other sparsity patterns. Our framework can achieve 37x, 1.9x, 1.7x speedup compared to CPU, GPU and FPGA, respectively.

Submitted 19 October, 2021; originally announced October 2021.

ACM Class: C.3; I.2

arXiv:2110.01167 [pdf, other]

Trustworthy AI: From Principles to Practices

Authors: Bo Li, Peng Qi, Bo Liu, Shuai Di, Jingen Liu, Jiquan Pei, Jinfeng Yi, Bowen Zhou

Abstract: The rapid development of Artificial Intelligence (AI) technology has enabled the deployment of various systems based on it. However, many current AI systems are found vulnerable to imperceptible attacks, biased against underrepresented groups, lacking in user privacy protection. These shortcomings degrade user experience and erode people's trust in all AI systems. In this review, we provide AI practitioners with a comprehensive guide for building trustworthy AI systems. We first introduce the theoretical framework of important aspects of AI trustworthiness, including robustness, generalization, explainability, transparency, reproducibility, fairness, privacy preservation, and accountability. To unify currently available but fragmented approaches toward trustworthy AI, we organize them in a systematic approach that considers the entire lifecycle of AI systems, ranging from data acquisition to model development, to system development and deployment, finally to continuous monitoring and governance. In this framework, we offer concrete action items for practitioners and societal stakeholders (e.g., researchers, engineers, and regulators) to improve AI trustworthiness. Finally, we identify key opportunities and challenges for the future development of trustworthy AI systems, where we identify the need for a paradigm shift toward comprehensively trustworthy AI systems.

Submitted 26 May, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

arXiv:2108.10509 [pdf, other]

doi 10.1145/3474085.3481548

Improving Fake News Detection by Using an Entity-enhanced Framework to Fuse Diverse Multimodal Clues

Authors: Peng Qi, Juan Cao, Xirong Li, Huan Liu, Qiang Sheng, Xiaoyue Mi, Qin He, Yongbiao Lv, Chenyang Guo, Yingchao Yu

Abstract: Recently, fake news with text and images has achieved more effective diffusion than text-only fake news, raising a severe issue of multimodal fake news detection. Current studies on this issue have made significant contributions to developing multimodal models, but they are defective in modeling the multimodal content sufficiently. Most of them only preliminarily model the basic semantics of the images as a supplement to the text, which limits their performance on detection. In this paper, we find three valuable text-image correlations in multimodal fake news: entity inconsistency, mutual enhancement, and text complementation. To effectively capture these multimodal clues, we innovatively extract visual entities (such as celebrities and landmarks) to understand the news-related high-level semantics of images, and then model the multimodal entity inconsistency and mutual enhancement with the help of visual entities. Moreover, we extract the embedded text in images as the complementation of the original text. All things considered, we propose a novel entity-enhanced multimodal fusion framework, which simultaneously models three cross-modal correlations to detect diverse multimodal fake news. Extensive experiments demonstrate the superiority of our model compared to the state of the art.

Submitted 23 August, 2021; originally announced August 2021.

Comments: To appear in MM 2021 industrial track (long paper)

arXiv:2108.02317 [pdf]

Efficient Fourier single-pixel imaging with Gaussian random sampling

Authors: Ziheng Qiu, Xinyi Guo, Tianao Lu, Pan Qi, Zibang Zhang, Jingang Zhong

Abstract: Fourier single-pixel imaging (FSI) is a branch of single-pixel imaging techniques. It uses Fourier basis patterns as structured patterns for spatial information acquisition in the Fourier domain. However, the spatial resolution of the image reconstructed by FSI mainly depends on the number of Fourier coefficients sampled. The reconstruction of a high-resolution image typically requires a number of Fourier coefficients to be sampled, and therefore takes a long data acquisition time. Here we propose a new sampling strategy for FSI. It allows FSI to reconstruct a clear and sharp image with a reduced number of measurements. The core of the proposed sampling strategy is to perform a variable density sampling in the Fourier space and, more importantly, the density with respect to the importance of Fourier coefficients is subject to a one-dimensional Gaussian function. Combined with compressive sensing, the proposed sampling strategy enables better reconstruction quality than conventional sampling strategies, especially when the sampling ratio is low. We experimentally demonstrate compressive FSI combined with the proposed sampling strategy is able to reconstruct a sharp and clear image of 256-by-256 pixels with a sampling ratio of 10%. The proposed method enables fast single-pixel imaging and provides a new approach for efficient spatial information acquisition.

Submitted 28 June, 2021; originally announced August 2021.

arXiv:2106.10401 [pdf]

Parallel frequency function-deep neural network for efficient complex broadband signal approximation

Authors: Zhi Zeng, Pengpeng Shi, Fulei Ma, Peihan Qi

Abstract: A neural network is essentially a high-dimensional complex mapping model by adjusting network weights for feature fitting. However, the spectral bias in network training leads to unbearable training epochs for fitting the high-frequency components in broadband signals. To improve the fitting efficiency of high-frequency components, the PhaseDNN was proposed recently by combining complex frequency band extraction and frequency shift techniques [Cai et al. SIAM J. SCI. COMPUT. 42, A3285 (2020)]. Our paper is devoted to an alternative candidate for fitting complex signals with high-frequency components. Here, a parallel frequency function-deep neural network (PFF-DNN) is proposed to suppress computational overhead while ensuring fitting accuracy by utilizing fast Fourier analysis of broadband signals and the spectral bias nature of neural networks. The effectiveness and efficiency of the proposed PFF-DNN method are verified based on detailed numerical experiments for six typical broadband signals.

Submitted 18 June, 2021; originally announced June 2021.

arXiv:2105.12951 [pdf, other]

VeniBot: Towards Autonomous Venipuncture with Automatic Puncture Area and Angle Regression from NIR Images

Authors: Xu Cao, Zijie Chen, Bolin Lai, Yuxuan Wang, Yu Chen, Zhengqing Cao, Zhilin Yang, Nanyang Ye, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

Abstract: Venipuncture is a common step in clinical scenarios, and automating it with robotics is of high practical value. Nowadays, only a few off-the-shelf robotic systems have been developed; however, they cannot fulfill practical usage for various reasons. In this paper, we develop a compact venipuncture robot -- VeniBot, with four parts, six motors and two imaging devices. For the automation, we focus on the positioning part and propose a Dual-In-Dual-Out network based on two-step learning and two-task learning, which can achieve fully automatic regression of the suitable puncture area and angle from near-infrared (NIR) images. The regressed suitable puncture area and angle can further navigate the positioning part of VeniBot, which is an important step towards a fully autonomous venipuncture robot. Validation on 30 VeniBot-collected volunteers shows a high mean Dice coefficient (DSC) of 0.7634 and a low angle error of 15.58° on suitable puncture area and angle regression respectively, indicating its potentially wide and practical application in the future.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2105.12945 [pdf, other]

VeniBot: Towards Autonomous Venipuncture with Semi-supervised Vein Segmentation from Ultrasound Images

Authors: Yu Chen, Yuxuan Wang, Bolin Lai, Zijie Chen, Xu Cao, Nanyang Ye, Zhongyuan Ren, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

Abstract: In the modern medical care, venipuncture is an indispensable procedure for both diagnosis and treatment. In this paper, unlike existing solutions that fully or partially rely on professional assistance, we propose VeniBot -- a compact robotic system solution integrating both novel hardware and software developments. For the hardware, we design a set of units to facilitate the supporting, positioning, puncturing and imaging functionalities. For the software, to move towards a full automation, we propose a novel deep learning framework -- semi-ResNeXt-Unet for semi-supervised vein segmentation from ultrasound images. From which, the depth information of vein is calculated and used to enable automated navigation for the puncturing unit. VeniBot is validated on 40 volunteers, where ultrasound images can be collected successfully. For the vein segmentation validation, the proposed semi-ResNeXt-Unet improves the dice similarity coefficient (DSC) by 5.36%, decreases the centroid error by 1.38 pixels and decreases the failure rate by 5.60%, compared to fully-supervised ResNeXt-Unet.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2105.06457 [pdf, ps, other]

Conversational AI Systems for Social Good: Opportunities and Challenges

Authors: Peng Qi, Jing Huang, Youzheng Wu, Xiaodong He, Bowen Zhou

Abstract: Conversational artificial intelligence (ConvAI) systems have attracted much academic and commercial attention recently, making significant progress on both fronts. However, little existing work discusses how these systems can be developed and deployed for social good in real-world applications, with comprehensive case studies and analyses of pros and cons. In this paper, we briefly review the progress the community has made towards better ConvAI systems and reflect on how existing technologies can help advance social good initiatives from various angles that are unique to ConvAI, or have not yet become common knowledge in the community. We further discuss the challenges ahead for ConvAI systems to better help us achieve these goals and highlight the risks involved in their development and deployment in the real world.

Submitted 7 January, 2022; v1 submitted 13 May, 2021; originally announced May 2021.

arXiv:2103.11794 [pdf, other]

Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification

Authors: Xiaochen Hou, Peng Qi, Guangtao Wang, Rex Ying, Jing Huang, Xiaodong He, Bowen Zhou

Abstract: Recent work on aspect-level sentiment classification has demonstrated the efficacy of incorporating syntactic structures such as dependency trees with graph neural networks (GNN), but these approaches are usually vulnerable to parsing errors. To better leverage syntactic information in the face of unavoidable errors, we propose a simple yet effective graph ensemble technique, GraphMerge, to make use of the predictions from different parsers. Instead of assigning one set of model parameters to each dependency tree, we first combine the dependency relations from different parses before applying GNNs over the resulting graph. This allows GNN models to be robust to parse errors at no additional computational cost, and helps avoid overparameterization and overfitting from GNN layer stacking by introducing more connectivity into the ensemble graph. Our experiments on the SemEval 2014 Task 4 and ACL 14 Twitter datasets show that our GraphMerge model not only outperforms models with single dependency tree, but also beats other ensemble models without adding model parameters.

Submitted 12 March, 2021; originally announced March 2021.

Comments: Accepted by NAACL 2021

arXiv:2102.06336 [pdf, ps, other]

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

Authors: Yuhong Song, Weiwen Jiang, Bingbing Li, Panjie Qi, Qingfeng Zhuge, Edwin Hsing-Mean Sha, Sakyasingha Dasgupta, Yiyu Shi, Caiwen Ding

Abstract: A pruning-based AutoML framework for run-time reconfigurability, namely RT3, is proposed in this work. This enables Transformer-based large Natural Language Processing (NLP) models to be efficiently executed on resource-constrained mobile devices and reconfigured (i.e., switching models for dynamic hardware conditions) at run-time. Such reconfigurability is the key to save energy for battery-powered mobile devices, which widely use dynamic voltage and frequency scaling (DVFS) technique for hardware reconfiguration to prolong battery life. In this work, we creatively explore a hybrid block-structured pruning (BP) and pattern pruning (PP) for Transformer-based models and first attempt to combine hardware and software reconfiguration to maximally save energy for battery-powered mobile devices. Specifically, RT3 integrates two-level optimizations: First, it utilizes an efficient BP as the first-step compression for resource-constrained mobile devices; then, RT3 heuristically generates a shrunken search space based on the first level optimization and searches multiple pattern sets with diverse sparsity for PP via reinforcement learning to support lightweight software reconfiguration, which corresponds to available frequency levels of DVFS (i.e., hardware reconfiguration). At run-time, RT3 can switch the lightweight pattern sets within 45ms to guarantee the required real-time constraint at different frequency levels. Results further show that RT3 can prolong battery life over 4x improvement with less than 1% accuracy loss for Transformer and 1.5% score decrease for DistilBERT.

Submitted 11 February, 2021; originally announced February 2021.

Comments: 7 pages, 5 figures

arXiv:2012.13169 [pdf, other]

SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II

Authors: Xiangjun Wang, Junxiao Song, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, Haitao Long, Quan Yuan

Abstract: AlphaStar, the AI that reaches GrandMaster level in StarCraft II, is a remarkable milestone demonstrating what deep reinforcement learning can achieve in complex Real-Time Strategy (RTS) games. However, the complexities of the game, algorithms and systems, and especially the tremendous amount of computation needed are big obstacles for the community to conduct further research in this direction. We propose a deep reinforcement learning agent, StarCraft Commander (SCC). With order of magnitude less computation, it demonstrates top human performance defeating GrandMaster players in test matches and top professional players in a live event. Moreover, it shows strong robustness to various human strategies and discovers novel strategies unseen from human plays. In this paper, we will share the key insights and optimizations on efficient imitation learning and reinforcement learning for StarCraft II full game.

Submitted 9 June, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

Comments: ICML 2021 camera ready

arXiv:2011.11337 [pdf, other]

DemodNet: Learning Soft Demodulation from Hard Information Using Convolutional Neural Network

Authors: Shilian Zheng, Xiaoyu Zhou, Shichuan Chen, Peihan Qi, Xiaoniu Yang

Abstract: Soft demodulation is a basic module of traditional communication receivers. It converts received symbols into soft bits, that is, log likelihood ratios (LLRs). However, in the nonideal additive white Gaussian noise (AWGN) channel, it is difficult to accurately calculate the LLR. In this letter, we propose a demodulator, DemodNet, based on a fully convolutional neural network with variable input and output length. We use hard bit information to train the DemodNet, and we propose log probability ratio (LPR) based on the output layer of the trained DemodNet to realize soft demodulation. The simulation results show that under the AWGN channel, the performance of both hard demodulation and soft demodulation of DemodNet is very close to the traditional methods. In three non-ideal channel scenarios, i.e., the presence of frequency deviation, additive generalized Gaussian noise (AGGN) channel, and Rayleigh fading channel, the performance of channel decoding using the soft information LPR obtained by DemodNet is better than the performance of decoding using the exact LLR calculated under the ideal AWGN assumption.

Submitted 23 November, 2020; originally announced November 2020.

Comments: 5 pages, 6 figures

arXiv:2010.12527 [pdf, other]

Answering Open-Domain Questions of Varying Reasoning Steps from Text

Authors: Peng Qi, Haejun Lee, Oghenetegiri "TG" Sido, Christopher D. Manning

Abstract: We develop a unified system to answer directly from text open-domain questions that may require a varying number of retrieval steps. We employ a single multi-task transformer model to perform all the necessary subtasks -- retrieving supporting facts, reranking them, and predicting the answer from all retrieved documents -- in an iterative fashion. We avoid crucial assumptions of previous work that do not transfer well to real-world settings, including exploiting knowledge of the fixed number of retrieval steps required to answer each question or using structured metadata like knowledge bases or web links that have limited availability. Instead, we design a system that can answer open-domain questions on any text collection without prior knowledge of reasoning complexity. To emulate this setting, we construct a new benchmark, called BeerQA, by combining existing one- and two-step datasets with a new collection of 530 questions that require three Wikipedia pages to answer, unifying Wikipedia corpora versions in the process. We show that our model demonstrates competitive performance on both existing benchmarks and this new benchmark. We make the new benchmark available at https://beerqa.github.io/.

Submitted 29 October, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

Comments: EMNLP 2021. Peng Qi, Haejun Lee, and TG Sido contributed equally

arXiv:2009.00419 [pdf, other]

doi 10.1103/PhysRevB.103.174511

Superconducting gap symmetry of the noncentrosymmetric superconductor W3Al2C

Authors: R. Gupta, T. P. Ying, Y. P. Qi, H. Hosono, R. Khasanov

Abstract: Detailed zero-field and transverse-field muon spin relaxation/rotation ($μ$SR) experiments have been carried out on the recently discovered non-centrosymmetric superconductor W$_3$Al$_2$C to speculate about its superconducting ground state. Bulk nature of superconductivity below 7.6 K is confirmed through magnetization measurements. No change in the $μ$SR spectra collected above and below $T_c$ is visible, ruling out the possibility of spontaneous magnetic field below $T_c$. This confirms that time-reversal symmetry is preserved for W$_3$Al$_2$C upon entering the superconducting ground state. Temperature dependent superfluid density [$ρ_s(T)$], which directly reflects the superconducting gap symmetry, is obtained by the analysis of spectra obtained from the transverse-field $μ$SR experiments. Despite a non-centrosymmetric structure, W$_3$Al$_2$C adopts a fully gapped spin-singlet superconducting ground state with a zero temperature value of gap $Δ_0$ = 1.158(8) meV with gap-to-$T_c$ ratio 2$Δ_0/k_BT_c\approx$3.54, classifying this material as a weakly-coupled superconductor.

Submitted 1 September, 2020; originally announced September 2020.

Comments: 6 pages, 5 figures

Journal ref: Phys. Rev. B 103, 174511 (2021)

arXiv:2008.12348 [pdf, other]

Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations

Authors: Ashwin Paranjape, Abigail See, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D. Manning

Abstract: We present Chirpy Cardinal, an open-domain dialogue agent, as a research platform for the 2019 Alexa Prize competition. Building an open-domain socialbot that talks to real people is challenging - such a system must meet multiple user expectations such as broad world knowledge, conversational style, and emotional connection. Our socialbot engages users on their terms - prioritizing their interests, feelings and autonomy. As a result, our socialbot provides a responsive, personalized user experience, capable of talking knowledgeably about a wide variety of topics, as well as chatting empathetically about ordinary life. Neural generation plays a key role in achieving these goals, providing the backbone for our conversational and emotional tone. At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0, a median conversation duration of 2 minutes 16 seconds, and a 90th percentile duration of over 12 minutes.

Submitted 5 September, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

Comments: Published in 3rd Proceedings of Alexa Prize (Alexa Prize 2019)

arXiv:2008.09084 [pdf, other]

Do Syntax Trees Help Pre-trained Transformers Extract Information?

Authors: Devendra Singh Sachan, Yuhao Zhang, Peng Qi, William Hamilton

Abstract: Much recent work suggests that incorporating syntax information from dependency trees can improve task-specific transformer models. However, the effect of incorporating dependency tree information into pre-trained transformer models (e.g., BERT) remains unclear, especially given recent studies highlighting how these models implicitly encode syntax. In this work, we systematically study the utility of incorporating dependency trees into pre-trained transformers on three representative information extraction tasks: semantic role labeling (SRL), named entity recognition, and relation extraction. We propose and investigate two distinct strategies for incorporating dependency structure: a late fusion approach, which applies a graph neural network on the output of a transformer, and a joint fusion approach, which infuses syntax structure into the transformer attention layers. These strategies are representative of prior work, but we introduce additional model design elements that are necessary for obtaining improved performance. Our empirical analysis demonstrates that these syntax-infused transformers obtain state-of-the-art results on SRL and relation extraction tasks. However, our analysis also reveals a critical shortcoming of these models: we find that their performance gains are highly contingent on the availability of human-annotated dependency parses, which raises important questions regarding the viability of syntax-augmented transformers in real-world applications.

Submitted 26 January, 2021; v1 submitted 20 August, 2020; originally announced August 2020.

Comments: EACL 2021. Code available at: https://github.com/DevSinghSachan/syntax-augmented-bert

Showing 1–50 of 77 results for author: Qi, P