
Showing 1–50 of 453 results for author: Tao, J

  1. arXiv:2406.11937  [pdf, other]

    physics.ins-det hep-ex physics.data-an

    Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

    Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola, et al. (550 additional authors not shown)

    Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles readout by SiPMs and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadr…

    Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Prepared for submission to JINST

  2. arXiv:2406.10591  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

    Authors: Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li

    Abstract: Foley audio, critical for enhancing the immersive experience in multimedia content, faces significant challenges in the AI-generated content (AIGC) landscape. Despite advancements in AIGC technologies for text and image generation, the foley audio dubbing remains rudimentary due to difficulties in cross-modal scene matching and content correlation. Current text-to-audio technology, which relies on…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2406.09670  [pdf]

    physics.bio-ph cond-mat.soft

    Delayed phosphate release can highly improve energy efficiency of muscle contraction

    Authors: Jiaxiang Xu, Jiangke Tao, Bin Chen

    Abstract: The power stroke of myosin and the release of inorganic phosphate (Pi) are pivotal in the conversion of ATP's chemical energy into mechanical work. Although the precise sequence of these two events remains a subject of debate, it is generally agreed that Pi-release into the solution doesn't occur instantly upon the binding of a myosin to actin. Here, we examine how Pi-release that is not directly…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2406.08112  [pdf, other]

    cs.SD cs.AI eess.AS

    Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio

    Authors: Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi

    Abstract: With the proliferation of Large Language Model (LLM) based deepfake audio, there is an urgent need for effective detection methods. Previous deepfake audio generation methods typically involve a multi-step generation process, with the final step using a vocoder to predict the waveform from handcrafted features. However, LLM-based audio is directly generated from discrete neural codecs in an end-to…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024. arXiv admin note: substantial text overlap with arXiv:2405.04880

  5. arXiv:2406.07381  [pdf, other]

    cs.AI cs.LG

    World Models with Hints of Large Language Models for Goal Achieving

    Authors: Zeyuan Liu, Ziyu Huan, Xiyao Wang, Jiafei Lyu, Jian Tao, Xiu Li, Furong Huang, Huazhe Xu

    Abstract: Reinforcement learning struggles in the face of long-horizon tasks and sparse goals due to the difficulty in manual reward specification. While existing methods address this by adding intrinsic rewards, they may fail to provide meaningful guidance in long-horizon decision-making tasks with large state and action spaces, lacking purposeful exploration. Inspired by human cognition, we propose a new…

    Submitted 11 June, 2024; originally announced June 2024.

  6. arXiv:2406.06086  [pdf, other]

    cs.SD eess.AS

    RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection

    Authors: Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, Cunhang Fan

    Abstract: Fake artefacts for discriminating between bonafide and fake audio can exist in both short- and long-range segments. Therefore, combining local and global feature information can effectively discriminate between bonafide and fake audio. This paper proposes an end-to-end bidirectional state space model, named RawBMamba, to capture both short- and long-range discriminative information for audio deepf…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  7. arXiv:2406.04840  [pdf, other]

    cs.SD eess.AS

    TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking

    Authors: Junzuo Zhou, Jiangyan Yi, Tao Wang, Jianhua Tao, Ye Bai, Chu Yuan Zhang, Yong Ren, Zhengqi Wen

    Abstract: Various threats posed by the progress in text-to-speech (TTS) have prompted the need to reliably trace synthesized speech. However, contemporary approaches to this task involve adding watermarks to the audio separately after generation, a process that hurts both speech quality and watermark imperceptibility. In addition, these approaches are limited in robustness and flexibility. To address these…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  8. arXiv:2406.04683  [pdf, other]

    cs.SD eess.AS

    PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation

    Authors: Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang

    Abstract: Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description, playing a crucial role in media production. The text descriptions in TTA datasets lack rich variations and diversity, resulting in a drop in TTA model performance when faced with complex text. To address this issue, we propose a method called Portable Plug-in Prompt Refiner, which utilizes rich knowledge abo…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: accepted by INTERSPEECH2024

  9. arXiv:2406.04027  [pdf, other]

    cs.CR cs.SE

    PowerPeeler: A Precise and General Dynamic Deobfuscation Method for PowerShell Scripts

    Authors: Ruijie Li, Chenyang Zhang, Huajun Chai, Lingyun Ying, Haixin Duan, Jun Tao

    Abstract: PowerShell is a powerful and versatile task automation tool. Unfortunately, it is also widely abused by cyber attackers. To bypass malware detection and hinder threat analysis, attackers often employ diverse techniques to obfuscate malicious PowerShell scripts. Existing deobfuscation tools suffer from the limitation of static analysis, which fails to simulate the real deobfuscation process accurat…

    Submitted 19 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: To appear in the ACM CCS 2024

  10. arXiv:2406.03247  [pdf, other]

    cs.SD eess.AS

    Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection

    Authors: Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi

    Abstract: The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new spoofing techniques. Traditional FAD methods often focus solely on distinguishing between genuine and known spoofed audio. We propose a Genuine-Focused Learning (GFL) framework guided, aiming for highly generalized FAD, called GFL-FAD. This method incorporates a Counterfactual Reasoning Enhanced Representation…

    Submitted 9 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  11. arXiv:2406.03240  [pdf, other]

    cs.SD cs.AI eess.AS

    Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy

    Authors: Yuankun Xie, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Xiaopeng Wang, Haonan Cheng, Long Ye, Jianhua Tao

    Abstract: With the proliferation of deepfake audio, there is an urgent need to investigate their attribution. Current source tracing methods can effectively distinguish in-distribution (ID) categories. However, the rapid evolution of deepfake algorithms poses a critical challenge in the accurate identification of out-of-distribution (OOD) novel deepfake algorithms. In this paper, we propose Real Emphasis an…

    Submitted 8 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  12. arXiv:2406.03237  [pdf, other]

    cs.SD eess.AS

    Generalized Fake Audio Detection via Deep Stable Learning

    Authors: Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi

    Abstract: Although current fake audio detection approaches have achieved remarkable success on specific datasets, they often fail when evaluated with datasets from different distributions. Previous studies typically address distribution shift by focusing on using extra data or applying extra loss restrictions during training. However, these methods either require a substantial amount of data or complicate t…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: accepted by INTERSPEECH2024

  13. arXiv:2406.00504  [pdf]

    cs.RO cs.AI

    Research on an Autonomous UAV Search and Rescue System Based on the Improved

    Authors: Haobin Chen, Junyu Tao, Bize Zhou, Xiaoyan Liu

    Abstract: The demand is to solve the issue of UAV (unmanned aerial vehicle) operating autonomously and implementing practical functions such as search and rescue in complex unknown environments. This paper proposes an autonomous search and rescue UAV system based on an EGO-Planner algorithm, which is improved by innovative UAV body application and takes the methods of inverse motor backstepping to enhance t…

    Submitted 7 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: 2024 5th International Conference on Computer Engineering and Application

  14. arXiv:2405.20914  [pdf, other]

    cs.CR

    RASE: Efficient Privacy-preserving Data Aggregation against Disclosure Attacks for IoTs

    Authors: Zuyan Wang, Jun Tao, Dika Zou

    Abstract: The growing popular awareness of personal privacy raises the following quandary: what is the new paradigm for collecting and protecting the data produced by ever-increasing sensor devices. Most previous studies on co-design of data aggregation and privacy preservation assume that a trusted fusion center adheres to privacy regimes. Very recent work has taken steps towards relaxing the assumption by…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 14 pages, 19 figures

  15. arXiv:2405.14913  [pdf, other]

    stat.ME cs.LG math.PR stat.ML

    High Rank Path Development: an approach of learning the filtration of stochastic processes

    Authors: Jiajie Tao, Hao Ni, Chong Liu

    Abstract: Since the weak convergence for stochastic processes does not account for the growth of information over time which is represented by the underlying filtration, a slightly erroneous stochastic model in weak topology may cause huge loss in multi-periods decision making problems. To address such discontinuities Aldous introduced the extended weak convergence, which can fully characterise all essentia…

    Submitted 23 May, 2024; originally announced May 2024.

  16. arXiv:2405.10576  [pdf, other]

    cs.RO

    An Efficient Learning Control Framework With Sim-to-Real for String-Type Artificial Muscle-Driven Robotic Systems

    Authors: Jiyue Tao, Yunsong Zhang, Sunil Kumar Rajendran, Feitian Zhang, Dexin Zhao, Tongsheng Shen

    Abstract: Robotic systems driven by artificial muscles present unique challenges due to the nonlinear dynamics of actuators and the complex designs of mechanical structures. Traditional model-based controllers often struggle to achieve desired control performance in such systems. Deep reinforcement learning (DRL), a trending machine learning technique widely adopted in robot control, offers a promising alte…

    Submitted 7 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  17. arXiv:2405.08596   

    cs.SD eess.AS

    EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark

    Authors: Xiaohui Zhang, Jiangyan Yi, Jianhua Tao

    Abstract: The rise of advanced large language models such as GPT-4, GPT-4o, and the Claude family has made fake audio detection increasingly challenging. Traditional fine-tuning methods struggle to keep pace with the evolving landscape of synthetic speech, necessitating continual learning approaches that can adapt to new audio while retaining the ability to detect older types. Continual learning, which acts…

    Submitted 15 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: This paper need more modification

  18. arXiv:2405.05741  [pdf, ps, other]

    cs.CL cs.AI

    Can large language models understand uncommon meanings of common words?

    Authors: Jinyang Wu, Feihu Che, Xinxin Zheng, Shuai Zhang, Ruihan Jin, Shuai Nie, Pengpeng Shao, Jianhua Tao

    Abstract: Large language models (LLMs) like ChatGPT have shown significant advancements across diverse natural language understanding (NLU) tasks, including intelligent dialogue and autonomous agents. Yet, lacking widely acknowledged testing mechanisms, answering `whether LLMs are stochastic parrots or genuinely comprehend the world' remains unclear, fostering numerous studies and sparking heated debates. P…

    Submitted 9 May, 2024; originally announced May 2024.

  19. arXiv:2405.04880  [pdf, other]

    cs.SD cs.AI eess.AS

    The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

    Authors: Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun

    Abstract: With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for generalized detection methods. ALM-based deepfake audio currently exhibits widespread, high deception, and type versatility, posing a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data. To effectively detect ALM-based deepfake audio, we focus on…

    Submitted 15 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  20. arXiv:2404.18661  [pdf, other]

    math.PR

    On the determination of path signature from its unitary development

    Authors: Siran Li, Zijiu Lyu, Hao Ni, Jiajie Tao

    Abstract: We establish an explicit, constructive approach to determine any element $X$ in the tensor algebra $\mathcal{T}\left(\mathbb{R}^d\right) = \bigoplus_{n=0}^\infty\left(\mathbb{R}^d\right)^{\otimes n}$ from its moment generating function. The only assumption is that $X$ has a nonzero radius of convergence, which relaxes the condition of having an infinite radius of convergence in the literature. The…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 24 pages, 8 figures, and 3 tables

    MSC Class: 60L20 (Primary); 62M99; 60G35; 62M07 (Secondary)

  21. arXiv:2404.17113  [pdf, other]

    cs.LG cs.HC

    MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition

    Authors: Zheng Lian, Haiyang Sun, Licai Sun, Zhuofan Wen, Siyuan Zhang, Shun Chen, Hao Gu, Jinming Zhao, Ziyang Ma, Xie Chen, Jiangyan Yi, Rui Liu, Kele Xu, Bin Liu, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao

    Abstract: Multimodal emotion recognition is an important research topic in artificial intelligence. Over the past few decades, researchers have made remarkable progress by increasing dataset size and building more effective architectures. However, due to various reasons (such as complex environments and inaccurate annotations), current systems are hard to meet the demands of practical applications. Therefor…

    Submitted 23 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  22. arXiv:2404.15660  [pdf, other]

    cs.CL

    KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering

    Authors: Xinxin Zheng, Feihu Che, Jinyang Wu, Shuai Zhang, Shuai Nie, Kang Liu, Jianhua Tao

    Abstract: Large language models (LLMs) suffer from the hallucination problem and face significant challenges when applied to knowledge-intensive tasks. A promising approach is to leverage evidence documents as extra supporting knowledge, which can be obtained through retrieval or generation. However, existing methods directly leverage the entire contents of the evidence document, which may introduce noise i…

    Submitted 24 April, 2024; originally announced April 2024.

  23. arXiv:2404.09606  [pdf, other]

    cs.LG cs.AI q-bio.QM

    A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions

    Authors: Pengfei Liu, Jun Tao, Zhixiang Ren

    Abstract: The task of chemical reaction predictions (CRPs) plays a pivotal role in advancing drug discovery and material science. However, its effectiveness is constrained by the vast and uncertain chemical reaction space and challenges in capturing reaction selectivity, particularly due to existing methods' limitations in exploiting the data's inherent knowledge. To address these challenges, we introduce a…

    Submitted 15 April, 2024; originally announced April 2024.

  24. arXiv:2404.07454  [pdf, other]

    cs.LG cs.NI

    Representation Learning of Tangled Key-Value Sequence Data for Early Classification

    Authors: Tao Duan, Junzhou Zhao, Shuo Zhang, Jing Tao, Pinghui Wang

    Abstract: Key-value sequence data has become ubiquitous and naturally appears in a variety of real-world applications, ranging from the user-product purchasing sequences in e-commerce, to network packet sequences forwarded by routers in networking. Classifying these key-value sequences is important in many scenarios such as user profiling and malicious applications identification. In many time-sensitive sce…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 12 pages, 31 figures, Accepted by ICDE2024

  25. arXiv:2404.03571  [pdf, other]

    hep-ex

    Extra Higgs boson searches at the LHC

    Authors: Junquan Tao

    Abstract: Many searches for additional Higgs bosons, which are predicted by a lot of interesting models beyond the standard model, have been performed at the LHC. Some selected latest results of the searches for extra Higgs bosons at the LHC are presented. These additional Higgs bosons could be produced either directly from the parton interactions or from the decays of the observed standard model Higgs boso…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Presented at the 12th Workshop on the CKM Unitarity Triangle, 18-22 September 2023, Santiago de Compostela

  26. arXiv:2404.01089  [pdf, other]

    cs.CV cs.AI

    Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

    Authors: Xu Yang, Changxing Ding, Zhibin Hong, Junhao Huang, Jin Tao, Xiangmin Xu

    Abstract: Image-based virtual try-on is an increasingly important task for online shopping. It aims to synthesize images of a specific person wearing a specified garment. Diffusion model-based approaches have recently become popular, as they are excellent at image synthesis tasks. However, these approaches usually employ additional image encoders and rely on the cross-attention mechanism for texture transfe…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  27. arXiv:2403.15044  [pdf, other]

    cs.CV cs.AI

    Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild

    Authors: Zhuofan Wen, Fengyu Zhang, Siyuan Zhang, Haiyang Sun, Mingyu Xu, Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao

    Abstract: Multimodal fusion is a significant method for most multimodal tasks. With the recent surge in the number of large pre-trained models, combining both multimodal fusion methods and pre-trained model features can achieve outstanding performance in many multimodal tasks. In this paper, we present our approach, which leverages both advantages for addressing the task of Expression (Expr) Recognition and…

    Submitted 22 March, 2024; originally announced March 2024.

  28. arXiv:2403.03821  [pdf, other]

    astro-ph.IM gr-qc

    Identifying Black Holes Through Space Telescopes and Deep Learning

    Authors: Yeqi Fang, Wei Hong, Jun Tao

    Abstract: The EHT has captured a series of images of black holes. These images could provide valuable information about the gravitational environment near the event horizon. However, accurate detection and parameter estimation for candidate black holes are necessary. This paper explores the potential for identifying black holes in the ultraviolet band using space telescopes. We establish a data pipeline for…

    Submitted 11 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: 18 pages, 18 figures, 8 tables. We propose a ensemble neural network which demonstrates the feasibility of detecting black holes in the UV band and provides a new method for the accurate and real-time detection of candidate black holes and further parameter estimation

  29. arXiv:2403.01318  [pdf, other]

    stat.ML cs.LG econ.EM

    High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

    Authors: Yuya Sasaki, Jing Tao, Yulong Wang

    Abstract: Motivated by the empirical power law of the distributions of credits (e.g., the number of "likes") of viral posts in social media, we introduce the high-dimensional tail index regression and methods of estimation and inference for its parameters. We propose a regularized estimator, establish its consistency, and derive its convergence rate. To conduct inference, we propose to debias the regularize…

    Submitted 2 March, 2024; originally announced March 2024.

  30. arXiv:2402.11432  [pdf, other]

    cs.CL

    Can Deception Detection Go Deeper? Dataset, Evaluation, and Benchmark for Deception Reasoning

    Authors: Kang Chen, Zheng Lian, Haiyang Sun, Bin Liu, Jianhua Tao

    Abstract: Deception detection has attracted increasing attention due to its importance in real-world scenarios. Its main goal is to detect deceptive behaviors from multimodal clues such as gestures, facial expressions, prosody, etc. However, these bases are usually subjective and related to personal habits. Therefore, we extend deception detection to deception reasoning, further providing objective evidence…

    Submitted 16 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  31. arXiv:2402.11082  [pdf, other]

    cs.CR cs.AI

    The AI Security Pyramid of Pain

    Authors: Chris M. Ward, Josh Harguess, Julia Tao, Daniel Christman, Paul Spicer, Mike Tan

    Abstract: We introduce the AI Security Pyramid of Pain, a framework that adapts the cybersecurity Pyramid of Pain to categorize and prioritize AI-specific threats. This framework provides a structured approach to understanding and addressing various levels of AI threats. Starting at the base, the pyramid emphasizes Data Integrity, which is essential for the accuracy and reliability of datasets and AI models…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: SPIE DCS 2024

  32. arXiv:2402.04119  [pdf, other]

    cs.LG cs.CE

    Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science

    Authors: Pengfei Liu, Jun Tao, Zhixiang Ren

    Abstract: Efficient molecular modeling and design are crucial for the discovery and exploration of novel molecules, and the incorporation of deep learning methods has revolutionized this field. In particular, large language models (LLMs) offer a fresh approach to tackle scientific problems from a natural language processing (NLP) perspective, introducing a research paradigm called scientific language modeli…

    Submitted 6 February, 2024; originally announced February 2024.

  33. Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion

    Authors: Cunhang Fan, Yujie Chen, Jun Xue, Yonghui Kong, Jianhua Tao, Zhao Lv

    Abstract: In recent years, knowledge graph completion (KGC) models based on pre-trained language model (PLM) have shown promising results. However, the large number of parameters and high computational cost of PLM models pose challenges for their application in downstream tasks. This paper proposes a progressive distillation method based on masked generation features for KGC task, aiming to significantly re…

    Submitted 10 June, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI2024

    Journal ref: (2024) Vol. 38 No. 8: AAAI-24 Technical Tracks 8, Proceedings of the AAAI Conference on Artificial Intelligence, 38(8), 8380-8388

  34. arXiv:2401.10273  [pdf]

    cs.CY cs.AI

    Revolutionizing Pharma: Unveiling the AI and LLM Trends in the Pharmaceutical Industry

    Authors: Yu Han, Jingwen Tao

    Abstract: This document offers a critical overview of the emerging trends and significant advancements in artificial intelligence (AI) within the pharmaceutical industry. Detailing its application across key operational areas, including research and development, animal testing, clinical trials, hospital clinical stages, production, regulatory affairs, quality control and other supporting areas, the paper ca…

    Submitted 21 January, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  35. arXiv:2401.09750  [pdf, other]

    cs.LG

    Exploration and Anti-Exploration with Distributional Random Network Distillation

    Authors: Kai Yang, Jian Tao, Jiafei Lyu, Xiu Li

    Abstract: Exploration remains a critical issue in deep reinforcement learning for an agent to attain high returns in unknown environments. Although the prevailing exploration Random Network Distillation (RND) algorithm has been demonstrated to be effective in numerous environments, it often needs more discriminative power in bonus allocation. This paper highlights the "bonus inconsistency" issue within RND,…

    Submitted 19 May, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: ICML 2024 accepted

  36. arXiv:2401.05698  [pdf, other]

    cs.CV cs.HC cs.MM cs.SD eess.AS

    HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition

    Authors: Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao

    Abstract: Audio-Visual Emotion Recognition (AVER) has garnered increasing attention in recent years for its critical role in creating emotion-aware intelligent machines. Previous efforts in this area are dominated by the supervised learning paradigm. Despite significant progress, supervised learning is meeting its bottleneck due to the longstanding data scarcity issue in AVER. Motivated by recent advances in…

    Submitted 1 April, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by Information Fusion. The code is available at https://github.com/sunlicai/HiCMAE

    Journal ref: Information Fusion, 2024

  37. arXiv:2401.03429  [pdf, other]

    cs.HC

    MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition

    Authors: Zheng Lian, Licai Sun, Yong Ren, Hao Gu, Haiyang Sun, Lan Chen, Bin Liu, Jianhua Tao

    Abstract: Multimodal emotion recognition plays a crucial role in enhancing user experience in human-computer interaction. Over the past few decades, researchers have proposed a series of algorithms and achieved impressive progress. Although each method shows its superior performance, different methods lack a fair comparison due to inconsistencies in feature extractors, evaluation manners, and experimental s…

    Submitted 20 April, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  38. arXiv:2401.02393  [pdf, ps, other]

    math.PR math.AP

    A PDE approach for solving the characteristic function of the generalised signature process

    Authors: Terry Lyons, Hao Ni, Jiajie Tao

    Abstract: The signature of a path, as a fundamental object in Rough path theory, serves as a generating function for non-commutative monomials on path space. It transforms the path into a grouplike element in the tensor algebra space, summarising the path faithfully up to a generalised form of re-parameterisation (a negligible equivalence class in this context). Our paper concerns stochastic processes and s…

    Submitted 29 February, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  39. arXiv:2401.00416  [pdf, other]

    cs.CV cs.HC cs.MM

    SVFAP: Self-supervised Video Facial Affect Perceiver

    Authors: Licai Sun, Zheng Lian, Kexin Wang, Yu He, Mingyu Xu, Haiyang Sun, Bin Liu, Jianhua Tao

    Abstract: Video-based facial affect analysis has recently attracted increasing attention owing to its critical role in human-computer interaction. Previous studies mainly focus on developing various deep learning architectures and training them in a fully supervised manner. Although significant progress has been achieved by these supervised methods, the longstanding lack of large-scale high-quality labeled…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: Submitted to IEEE Trans. on Affective Computing (February 8, 2023)

  40. arXiv:2312.15760  [pdf, other]

    gr-qc astro-ph.CO astro-ph.GA

    Gravitational Lensing of Spherically Symmetric Black Holes in Dark Matter Halos

    Authors: Yi-Gao Liu, Chen-Kai Qiao, Jun Tao

    Abstract: The gravitational lensing of supermassive black holes surrounded by dark matter halo has attracted a great number of interests in recent years. However, many studies employed simplified dark matter density models, which makes it very hard to give a precise prediction on the dark matter effects in real astrophysical galaxies. In this work, to more accurately describe the distribution of dark matter…

    Submitted 25 May, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

    Comments: 29 pages, 9 figures, 2 appendices

  41. arXiv:2312.15583  [pdf, other]

    cs.MM

    ITEACH-Net: Inverted Teacher-studEnt seArCH Network for Emotion Recognition in Conversation

    Authors: Haiyang Sun, Zheng Lian, Chenglong Wang, Kang Chen, Licai Sun, Bin Liu, Jianhua Tao

    Abstract: There remain two critical challenges that hinder the development of ERC. Firstly, there is a lack of exploration into mining deeper insights from the data itself for conversational emotion tasks. Secondly, the systems exhibit vulnerability to random modality feature missing, which is a common occurrence in realistic settings. Focusing on these two key challenges, we propose a novel framework for i…

    Submitted 1 June, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

  42. arXiv:2312.15258  [pdf, other

    cs.CV

    Human101: Training 100+FPS Human Gaussians in 100s from 1 View

    Authors: Mingwei Li, Jiachen Tao, Zongxin Yang, Yi Yang

    Abstract: Reconstructing the human body from single-view videos plays a pivotal role in the virtual reality domain. One prevalent application scenario necessitates the rapid reconstruction of high-fidelity 3D digital humans while simultaneously ensuring real-time rendering and interaction. Existing methods often struggle to fulfill both requirements. In this paper, we introduce Human101, a novel framework a…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Website: https://github.com/longxiang-ai/Human101

  43. arXiv:2312.14944  [pdf

    cond-mat.mtrl-sci cond-mat.str-el

    Surface termination effect of SrTiO3 substrate on ultrathin SrRuO3

    Authors: Huiyu Wang, Zhen Wang, Zeeshan Ali, Enling Wang, Mohammad Saghayezhian, Jiandong Guo, Yimei Zhu, Jing Tao, Jiandi Zhang

    Abstract: A uniform one-unit-cell-high step on the SrTiO3 substrate is a prerequisite for growing high-quality epitaxial oxide heterostructures. However, it is inevitable that defects induced by mixed substrate surface termination exist at the interface, significantly impacting the properties of ultrathin films. In this study, we microscopically identify the origin for the lateral inhomogeneity in the growt…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 19 pages, 10 figures, 30 references

  44. arXiv:2312.11912  [pdf, other

    gr-qc

    Probing the thermodynamics of charged Gauss Bonnet AdS black holes with the Lyapunov exponent

    Authors: Xin Lyu, Jun Tao, Peng Wang

    Abstract: In this paper, we investigate the thermodynamic properties of charged AdS Gauss-Bonnet black holes and the associations with the Lyapunov exponent. The chaotic features of the black holes and the isobaric heat capacity characterized by Lyapunov exponent are studied to reveal the stability of black hole phases. With the consideration of both timelike and null geodesic, we find the relationship betw…

    Submitted 19 December, 2023; originally announced December 2023.

    Report number: CTP-SCU/2023039

  45. arXiv:2312.09651  [pdf, other

    cs.SD cs.CR cs.LG eess.AS

    What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection

    Authors: Xiaohui Zhang, Jiangyan Yi, Chenglong Wang, Chuyuan Zhang, Siding Zeng, Jianhua Tao

    Abstract: The rapid evolution of speech synthesis and voice conversion has raised substantial concerns due to the potential misuse of such technology, prompting a pressing need for effective audio deepfake detection mechanisms. Existing detection models have shown remarkable success in discriminating known deepfake audio, but struggle when encountering new attack types. To address this challenge, one of the…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by the main track The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  46. arXiv:2312.04293  [pdf, other

    cs.CV cs.MM

    GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition

    Authors: Zheng Lian, Licai Sun, Haiyang Sun, Kang Chen, Zhuofan Wen, Hao Gu, Bin Liu, Jianhua Tao

    Abstract: Recently, GPT-4 with Vision (GPT-4V) has demonstrated remarkable visual capabilities across various tasks, but its performance in emotion recognition has not been fully evaluated. To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion re…

    Submitted 17 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  47. arXiv:2312.02243  [pdf, other

    cs.LG

    FlowHON: Representing Flow Fields Using Higher-Order Networks

    Authors: Nan Chen, Zhihong Li, Jun Tao

    Abstract: Flow fields are often partitioned into data blocks for massively parallel computation and analysis based on blockwise relationships. However, most of the previous techniques only consider the first-order dependencies among blocks, which is insufficient in describing complex flow patterns. In this work, we present FlowHON, an approach to construct higher-order networks (HONs) from flow fields. Flow…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: To be submitted to TVCG

  48. arXiv:2311.13231  [pdf, other

    cs.LG cs.AI cs.CV

    Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

    Authors: Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li

    Abstract: Using reinforcement learning with human feedback (RLHF) has shown significant promise in fine-tuning diffusion models. Previous methods start by training a reward model that aligns with human preferences, then leverage RL techniques to fine-tune the underlying models. However, crafting an efficient reward model demands extensive datasets, optimal architecture, and manual hyperparameter tuning, mak…

    Submitted 23 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: CVPR 2024 accepted; huggingface daily paper

  49. arXiv:2311.11606  [pdf, other

    hep-th

    Topology of Hořava-Lifshitz black holes in different ensembles

    Authors: Deyou Chen, Yucheng He, Jun Tao, Wei Yang

    Abstract: In this paper, we study topological numbers for uncharged and charged static black holes obtained in Hořava-Lifshitz gravity theory in different ensembles. We first calculate the topological numbers for the uncharged black holes by changing the value of the dynamic coupling constant, and find that the black holes with spherical and flat horizons have the same topological number. When the black hol…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 17 pages, 13 figures

    Journal ref: Eur. Phys. J. C (2024) 84:96

  50. arXiv:2311.11586  [pdf, other

    gr-qc hep-th

    Attractive interactions in the microstructures of asymptotically flat black holes

    Authors: Deyou Chen, Jun Tao, Xuetao Yang

    Abstract: In this work, we investigate the microstructure of asymptotically flat black holes with Ruppeiner curvature. Specially, the cosmological constant is considered to have a fluctuation around 0. Under such consideration, both repulsive and attractive interactions are found in the Reissner-Nordström and Kerr black holes, while the Schwarzschild black hole has dominant attractive interaction. The resul…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 17 pages, 3 figures

    Journal ref: Physics of the Dark Universe, 42 (2023) 101379