Showing 1–50 of 1,038 results for author: Xiao, H

  1. arXiv:2407.00136  [pdf, other]

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…

    Submitted 28 June, 2024; originally announced July 2024.

  2. arXiv:2406.18993  [pdf, ps, other]

    eess.SP

    Interference Cancellation Based Neural Receiver for Superimposed Pilot in Multi-Layer Transmission

    Authors: Han Xiao, Wenqiang Tian, Shi Jin, Wendong Liu, Jia Shen, Zhihua Shi, Zhi Zhang

    Abstract: In this paper, an interference cancellation based neural receiver for superimposed pilot (SIP) in multi-layer transmission is proposed, where the data and pilot are non-orthogonally superimposed in the same time-frequency resource. Specifically, to deal with the intra-layer and inter-layer interference of SIP under multi-layer transmission, the interference cancellation with superimposed symbol ai…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.18583  [pdf, other]

    cs.CV cs.LG

    Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

    Authors: Le Zhuo, Ruoyi Du, Han Xiao, Yangguang Li, Dongyang Liu, Rongjie Huang, Wenze Liu, Lirui Zhao, Fu-Yun Wang, Zhanyu Ma, Xu Luo, Zehan Wang, Kaipeng Zhang, Xiangyang Zhu, Si Liu, Xiangyu Yue, Dingning Liu, Wanli Ouyang, Ziwei Liu, Yu Qiao, Hongsheng Li, Peng Gao

    Abstract: Lumina-T2X is a nascent family of Flow-based Large Diffusion Transformers that establishes a unified framework for transforming noise into various modalities, such as images and videos, conditioned on text instructions. Despite its promising capabilities, Lumina-T2X still encounters challenges including training instability, slow inference, and extrapolation artifacts. In this paper, we present Lu…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Code at: https://github.com/Alpha-VLLM/Lumina-T2X

  4. arXiv:2406.17465  [pdf, other]

    cs.CL cs.AI

    Enhancing Tool Retrieval with Iterative Feedback from Large Language Models

    Authors: Qiancheng Xu, Yongqi Li, Heming Xia, Wenjie Li

    Abstract: Tool learning aims to enhance and expand large language models' (LLMs) capabilities with external tools, which has gained significant attention recently. Current methods have shown that LLMs can effectively handle a certain amount of tools through in-context learning or fine-tuning. However, in real-world scenarios, the number of tools is typically extensive and irregularly updated, emphasizing th…

    Submitted 25 June, 2024; originally announced June 2024.

  5. arXiv:2406.17446  [pdf, other]

    physics.flu-dyn

    Data-Driven Turbulence Modeling Approach for Cold-Wall Hypersonic Boundary Layers

    Authors: Muhammad I. Zafar, Xuhui Zhou, Christopher J. Roy, David Stelter, Heng Xiao

    Abstract: Wall-cooling effect in hypersonic boundary layers can significantly alter the near-wall turbulence behavior, which is not accurately modeled by traditional RANS turbulence models. To address this shortcoming, this paper presents a turbulence modeling approach for hypersonic flows with cold-wall conditions using an iterative ensemble Kalman method. Specifically, a neural-network-based turbulence mo…

    Submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.16125  [pdf, other]

    cs.CR cs.AI

    CBPF: Filtering Poisoned Data Based on Composite Backdoor Attack

    Authors: Hanfeng Xia, Haibo Hong, Ruili Wang

    Abstract: Backdoor attacks involve the injection of a limited quantity of poisoned examples containing triggers into the training dataset. During the inference stage, backdoor attacks can uphold a high level of accuracy for normal examples, yet when presented with trigger-containing instances, the model may erroneously predict them as the targeted class designated by the attacker. This paper explores strate…

    Submitted 23 June, 2024; originally announced June 2024.

  7. arXiv:2406.15962  [pdf]

    cs.LG cs.CR cs.ET

    Privacy Preserving Machine Learning for Electronic Health Records using Federated Learning and Differential Privacy

    Authors: Naif A. Ganadily, Han J. Xia

    Abstract: An Electronic Health Record (EHR) is an electronic database used by healthcare providers to store patients' medical records which may include diagnoses, treatments, costs, and other personal information. Machine learning (ML) algorithms can be used to extract and analyze patient data to improve patient care. Patient records contain highly sensitive information, such as social security numbers (SSN…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 5 pages, 12 figures

  8. arXiv:2406.14877  [pdf, other]

    cs.CL

    Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video

    Authors: Zhengbang Yang, Haotian Xia, Jingxi Li, Zezhi Chen, Zhuangdi Zhu, Weining Shen

    Abstract: Understanding sports is crucial for the advancement of Natural Language Processing (NLP) due to its intricate and dynamic nature. Reasoning over complex sports scenarios has posed significant challenges to current NLP technologies which require advanced cognitive capabilities. Toward addressing the limitations of existing benchmarks on sports understanding in the NLP field, we extensively evaluate…

    Submitted 21 June, 2024; originally announced June 2024.

  9. arXiv:2406.14841  [pdf, other]

    cs.CR cs.DB cs.LG

    TabularMark: Watermarking Tabular Datasets for Machine Learning

    Authors: Yihao Zheng, Haocheng Xia, Junyuan Pang, Jinfei Liu, Kui Ren, Lingyang Chu, Yang Cao, Li Xiong

    Abstract: Watermarking is broadly utilized to protect ownership of shared data while preserving data utility. However, existing watermarking methods for tabular datasets fall short on the desired properties (detectability, non-intrusiveness, and robustness) and only preserve data utility from the perspective of data statistics, ignoring the performance of downstream ML models trained on the datasets. Can we…

    Submitted 20 June, 2024; originally announced June 2024.

  10. arXiv:2406.12970  [pdf, other]

    hep-ph astro-ph.CO

    Warm and Fuzzy Dark Matter: Free Streaming of Wave Dark Matter

    Authors: Rayne Liu, Wayne Hu, Huangyu Xiao

    Abstract: Wave or fuzzy dark matter that is produced with relativistic wavenumbers exhibits free streaming effects analogous to warm or hot particle dark matter with relativistic momenta. Axions produced after inflation provide such a warm or mildly relativistic candidate, where the enhanced suppression and observational bounds are only moderately stronger than that from wave propagation of initially cold a…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 16 pages, 11 figures

    Report number: FERMILAB-PUB-24-0296-T

  11. arXiv:2406.12754  [pdf, other]

    cs.CL cs.AI

    Chumor 1.0: A Truly Funny and Challenging Chinese Humor Understanding Dataset from Ruo Zhi Ba

    Authors: Ruiqi He, Yushu He, Longju Bai, Jiarui Liu, Zhenjie Sun, Zenghao Tang, He Wang, Hanchen Xia, Naihao Deng

    Abstract: Existing humor datasets and evaluations predominantly focus on English, lacking resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, a dataset sourced from Ruo Zhi Ba (RZB), a Chinese Reddit-like platform dedicated to sharing intellectually challenging and culturally specific jokes. We annotate explanations for each joke and evalua…

    Submitted 18 June, 2024; originally announced June 2024.

  12. arXiv:2406.12255  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning

    Authors: Lijie Hu, Liang Liu, Shu Yang, Xin Chen, Hongru Xiao, Mengdi Li, Pan Zhou, Muhammad Asif Ali, Di Wang

    Abstract: Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance for large language models (LLMs). While some studies focus on improving CoT accuracy through methods like retrieval enhancement, a rigorous explanation for why CoT achieves such success remains unclear. In this paper, we analyze CoT methods under two different settings by asking the following questions: (1…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 21 pages

  13. arXiv:2406.12252  [pdf, other]

    cs.CL

    Language and Multimodal Models in Sports: A Survey of Datasets and Applications

    Authors: Haotian Xia, Zhengbang Yang, Yun Zhao, Yuqing Wang, Jingxi Li, Rhys Tracy, Zhuangdi Zhu, Yuan-fang Wang, Hanjie Chen, Weining Shen

    Abstract: Recent integration of Natural Language Processing (NLP) and multimodal models has advanced the field of sports analytics. This survey presents a comprehensive review of the datasets and applications driving these innovations post-2020. We overviewed and categorized datasets into three primary types: language-based, multimodal, and convertible datasets. Language-based and multimodal datasets are fo…

    Submitted 17 June, 2024; originally announced June 2024.

  14. arXiv:2406.10985  [pdf, other]

    cs.CL

    Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens

    Authors: Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui

    Abstract: Large language models (LLMs) have shown promising efficacy across various tasks, becoming powerful tools in numerous aspects of human life. However, Transformer-based LLMs suffer a performance degradation when modeling long-term contexts because they discard some information to reduce computational overhead. In this work, we propose a simple yet effective method to enable LLMs to take a deep breath…

    Submitted 16 June, 2024; originally announced June 2024.

  15. arXiv:2406.10637  [pdf]

    cond-mat.mes-hall

    Photonic realization of chiral hinge states in a Chern-insulator stack

    Authors: Han-Rong Xia, Jia-Zheng Li, Si-Yu Yuan, Meng Xiao

    Abstract: Higher-order topological insulators, as a novel family of topological phases, are a hot frontier in condensed matter physics due to their adherence to unconventional bulk-boundary correspondence. A three-dimensional second-order topological insulator can support one-dimensional modes along its hinges (dubbed as hinge states). Here, we present a simple and direct method to construct chiral hinge mo…

    Submitted 15 June, 2024; originally announced June 2024.

  16. arXiv:2406.10596  [pdf, ps, other]

    math.RA

    Twisting of Lie triple systems, $L_\infty$-algebras, and (generalized) matched pairs

    Authors: Jia Zhao, Haobo Xia

    Abstract: In this paper, we introduce notions of (proto-, quasi-)twilled Lie triple systems and give their equivalent descriptions using the controlling algebra and bidegree convention. Then we construct an $L_\infty$-algebra via a twilled Lie triple system. Besides, we establish the twisting theory of Lie triple systems and then characterize the twisting as a Maurer-Cartan element in the constructed…

    Submitted 15 June, 2024; originally announced June 2024.

  17. arXiv:2406.09499  [pdf, other]

    hep-ph astro-ph.CO

    Axion Stars: Mass Functions and Constraints

    Authors: Jae Hyeok Chang, Patrick J. Fox, Huangyu Xiao

    Abstract: The QCD axion and axion-like particles, as leading dark matter candidates, can also have interesting implications for dark matter substructures if the Peccei-Quinn symmetry is broken after inflation. In such a scenario, axion perturbations on small scales will lead to the formation of axion miniclusters at matter-radiation equality, and subsequently the formation of axion stars. Such compact objec…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 20 pages + refs, 5 figures

    Report number: FERMILAB-PUB-24-0295-T

  18. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. For RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  19. arXiv:2406.08754  [pdf, other]

    cs.CL cs.CR

    StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Encoded Structure

    Authors: Bangxin Li, Hengrui Xing, Chao Huang, ** Qian, Huangqing Xiao, Linfeng Feng, Cong Tian

    Abstract: Large Language Models (LLMs) are widely used in natural language processing but face the risk of jailbreak attacks that maliciously induce them to generate harmful content. Existing jailbreak attacks, including character-level and context-level attacks, mainly focus on the prompt of the plain text without specifically exploring the significant influence of its structure. In this paper, we focus on…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures

  20. arXiv:2406.06985  [pdf]

    q-bio.GN

    pVACview: an interactive visualization tool for efficient neoantigen prioritization and selection

    Authors: Huiming Xia, My Hoang, Evelyn Schmidt, Susanna Kiwala, Joshua McMichael, Zachary L. Skidmore, Bryan Fisk, Jonathan J. Song, Jasreet Hundal, Thomas Mooney, Jason R. Walker, S. Peter Goedegebuure, Christopher A. Miller, William E. Gillanders, Obi L. Griffith, Malachi Griffith

    Abstract: Neoantigen targeting therapies including personalized vaccines have shown promise in the treatment of cancers. Accurate identification/prioritization of neoantigens is highly relevant to designing clinical trials, predicting treatment response, and understanding mechanisms of resistance. With the advent of massively parallel sequencing technologies, it is now possible to predict neoantigens based…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Supplemental tables available at 10.5281/zenodo.11534338

  21. arXiv:2406.05540  [pdf, other]

    q-bio.QM cs.AI cs.CL cs.LG

    A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding

    Authors: Yiqing Shen, Zan Chen, Michail Mamalakis, Luhan He, Haiyang Xia, Tianbin Li, Yanzhou Su, Junjun He, Yu Guang Wang

    Abstract: The parallels between protein sequences and natural language in their sequential structures have inspired the application of large language models (LLMs) to protein understanding. Despite the success of LLMs in NLP, their effectiveness in comprehending protein sequences remains an open question, largely due to the absence of datasets linking protein sequences to descriptive text. Researchers have…

    Submitted 8 June, 2024; originally announced June 2024.

  22. arXiv:2406.03059  [pdf, other]

    cs.LG

    Efficient Exploration of the Rashomon Set of Rule Set Models

    Authors: Martino Ciaperoni, Han Xiao, Aristides Gionis

    Abstract: Today, as increasingly complex predictive models are developed, simple rule sets remain a crucial tool to obtain interpretable predictions and drive high-stakes decision making. However, a single rule set provides a partial representation of a learning task. An emerging paradigm in interpretable machine learning aims at exploring the Rashomon set of all models exhibiting near-optimal performance.…

    Submitted 5 June, 2024; originally announced June 2024.

  23. arXiv:2405.20204  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR

    Jina CLIP: Your CLIP Model Is Also Your Text Retriever

    Authors: Andreas Koukounas, Georgios Mastrapas, Michael Günther, Bo Wang, Scott Martens, Isabelle Mohr, Saba Sturua, Mohammad Kalim Akram, Joan Fontanals Martínez, Saahil Ognawala, Susana Guzman, Maximilian Werk, Nan Wang, Han Xiao

    Abstract: Contrastive Language-Image Pretraining (CLIP) is widely used to train models to align images and texts in a common embedding space by mapping them to fixed-sized vectors. These models are key to multimodal information retrieval and related tasks. However, CLIP models generally underperform in text-only tasks compared to specialized text models. This creates inefficiencies for information retrieval…

    Submitted 26 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 4 pages, MFM-EAI@ICML2024

    MSC Class: 68T50 ACM Class: I.2.7

  24. arXiv:2405.17934  [pdf, other]

    cs.AI

    Proof of Quality: A Costless Paradigm for Trustless Generative AI Model Inference on Blockchains

    Authors: Zhenjie Zhang, Yuyang Rao, Hao Xiao, Xiaokui Xiao, Yin Yang

    Abstract: Generative AI models, such as GPT-4 and Stable Diffusion, have demonstrated powerful and disruptive capabilities in natural language and image tasks. However, deploying these models in decentralized environments remains challenging. Unlike traditional centralized deployment, systematically guaranteeing the integrity of AI model services in fully decentralized environments, particularly on trustles…

    Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 12 pages, 5 figures

  25. arXiv:2405.17054  [pdf, other]

    cs.LG

    Improving Data-aware and Parameter-aware Robustness for Continual Learning

    Authors: Hanxi Xiao, Fan Lyu

    Abstract: The goal of Continual Learning (CL) task is to continuously learn multiple new tasks sequentially while achieving a balance between the plasticity and stability of new and old knowledge. This paper analyzes that this insufficiency arises from the ineffective handling of outliers, leading to abnormal gradients and unexpected model updates. To address this issue, we enhance the data-aware and parame…

    Submitted 27 May, 2024; originally announced May 2024.

  26. arXiv:2405.14543  [pdf]

    physics.soc-ph

    Initial Burst of Disruptive Efforts Ensuring Scientific Career Viability

    Authors: Shuang Zhang, Feifan Liu, Haoxiang Xia

    Abstract: Despite persistent efforts to understand the dynamics of creativity of scientists over careers in terms of productivity, impact, and prize, little is known about the dynamics of scientists' disruptive efforts that affect individual academic careers and drive scientific advance. Drawing on millions of data over six decades and across nineteen disciplines, associating the publication records of indi…

    Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  27. arXiv:2405.14480  [pdf, other]

    cs.CV

    Scalable Visual State Space Model with Fractal Scanning

    Authors: Lv Tang, HaoKe Xiao, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li

    Abstract: Foundational models have significantly advanced in natural language processing (NLP) and computer vision (CV), with the Transformer architecture becoming a standard backbone. However, the Transformer's quadratic complexity poses challenges for handling longer sequences and higher resolution images. To address this challenge, State Space Models (SSMs) like Mamba have emerged as efficient alternativ…

    Submitted 26 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: This paper is a work in progress

  28. arXiv:2405.12430  [pdf, other]

    astro-ph.HE

    Studying magnetic reconnection with synchrotron polarization statistics

    Authors: Jian-Fu Zhang, Shi-Min Liang, Hua-Ping Xiao

    Abstract: Magnetic reconnection is a fundamental process for releasing magnetic energy in space physics and astrophysics. At present, the usual way to investigate the reconnection process is through analytical studies or first-principles numerical simulations. This paper is the first to understand the turbulent magnetic reconnection process by exploring the nature of magnetic turbulence. From the perspectiv…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 12 pages, 7 figures, 1 table. Accepted for publication in ApJ

  29. arXiv:2405.11893  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Tunable moiré bandgap in hBN-aligned bilayer graphene device with in-situ electrostatic gating

    Authors: Hanbo Xiao, Han Gao, Min Li, Fanqiang Chen, Qiao Li, Yiwei Li, Meixiao Wang, Fangyuan Zhu, Lexian Yang, Feng Miao, Yulin Chen, Cheng Chen, Bin Cheng, Jianpeng Liu, Zhongkai Liu

    Abstract: Over the years, great efforts have been devoted to introducing a sizable and tunable band gap in graphene for its potential application in next-generation electronic devices. The primary challenge in modulating this gap has been the absence of a direct method for observing changes of the band gap in momentum space. In this study, we employ advanced spatial- and angle-resolved photoemission spectro…

    Submitted 24 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 16 pages, 4 figures

  30. arXiv:2405.10577  [pdf, other]

    cs.CV cs.RO

    DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection

    Authors: Zhe Huang, Yizhe Zhao, Hao Xiao, Chenyan Wu, Lingting Ge

    Abstract: Recent advances in multi-view camera-only 3D object detection either rely on an accurate reconstruction of bird's-eye-view (BEV) 3D features or on traditional 2D perspective view (PV) image features. While both have their own pros and cons, few have found a way to stitch them together in order to benefit from "the best of both worlds". To this end, we explore a duo space (i.e., BEV and PV) 3D perc…

    Submitted 17 May, 2024; originally announced May 2024.

  31. arXiv:2405.09066  [pdf, other]

    hep-ex

    Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, et al. (559 additional authors not shown)

    Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 14 pages, 7 figures

  32. arXiv:2405.08603  [pdf, other]

    cs.CL

    A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine

    Authors: Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang

    Abstract: Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal large language models (MLLMs) have garnered significant attention due to their powerful and general capabilities in understanding, reasoning, and generation, thereby offering new paradigms for the integration of artificial intelligence with medicine. This survey comprehensively overviews the development background…

    Submitted 14 May, 2024; originally announced May 2024.

  33. arXiv:2405.04929  [pdf, ps, other]

    cs.IR

    Enabling Roll-up and Drill-down Operations in News Exploration with Knowledge Graphs for Due Diligence and Risk Management

    Authors: Sha Wang, Yuchen Li, Hanhua Xiao, Zhifeng Bao, Lambert Deng, Yanfei Dong

    Abstract: Efficient news exploration is crucial in real-world applications, particularly within the financial sector, where numerous control and risk assessment tasks rely on the analysis of public news reports. The current processes in this domain predominantly rely on manual efforts, often involving keyword-based searches and the compilation of extensive keyword lists. In this paper, we introduce NCEXPLORE…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: The paper was accepted by ICDE 2024

  34. arXiv:2405.04483  [pdf, other]

    physics.ao-ph

    CloudDiff: Super-resolution ensemble retrieval of cloud properties for all day using the generative diffusion model

    Authors: Haixia Xiao, Feng Zhang, Lingxiao Wang, Wenwen Li, Bin Guo, Jun Li

    Abstract: Clouds play a crucial role in the Earth's water and energy cycles, underscoring the importance of high spatiotemporal resolution data on cloud phase and properties for accurate numerical modeling and weather prediction. Currently, Moderate Resolution Imaging Spectroradiometer (MODIS) provides cloud products with a spatial resolution of 1 km. However, these products suffer from a lengthy revisit cy…

    Submitted 7 May, 2024; originally announced May 2024.

    Report number: RIKEN-iTHEMS-Report-2024

  35. arXiv:2405.01461  [pdf, other]

    cs.CV

    SATO: Stable Text-to-Motion Framework

    Authors: Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen

    Abstract: Is the Text to Motion model robust? Recent advancements in Text to Motion models primarily stem from more accurate predictions of specific actions. However, the text modality typically relies solely on pre-trained Contrastive Language-Image Pretraining (CLIP) models. Our research has uncovered a significant issue with the text-to-motion model: its predictions often exhibit inconsistent outputs, re…

    Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  36. arXiv:2405.00256  [pdf, other]

    cs.CV

    ASAM: Boosting Segment Anything Model with Adversarial Tuning

    Authors: Bo Li, Haoke Xiao, Lv Tang

    Abstract: In the evolving landscape of computer vision, foundation models have emerged as pivotal tools, exhibiting exceptional adaptability to a myriad of tasks. Among these, the Segment Anything Model (SAM) by Meta AI has distinguished itself in image segmentation. However, SAM, like its counterparts, encounters limitations in specific niche applications, prompting a quest for enhancement strategies that…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: This paper is accepted by CVPR2024

  37. arXiv:2404.19383  [pdf, other]

    cs.CV

    Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition

    Authors: Zhendong Liu, Haifeng Xia, Tong Guo, Libo Sun, Ming Shao, Siyu Xia

    Abstract: Human action video recognition has recently attracted more attention in applications such as video security and sports posture correction. Popular solutions, including graph convolutional networks (GCNs) that model the human skeleton as a spatiotemporal graph, have proven very effective. GCNs-based methods with stacked blocks usually utilize top-layer semantics for classification/annotation purpos…

    Submitted 30 April, 2024; originally announced April 2024.

  38. arXiv:2404.19038  [pdf, other]

    cs.CV cs.AI

    Embedded Representation Learning Network for Animating Styled Video Portrait

    Authors: Tianyong Wang, Xiangyu Liang, Wangguandong Zheng, Dan Niu, Haifeng Xia, Siyu Xia

    Abstract: The talking head generation recently attracted considerable attention due to its widespread application prospects, especially for digital avatars and 3D animation design. Inspired by this practical demand, several works explored Neural Radiance Fields (NeRF) to synthesize the talking heads. However, these methods based on NeRF face two challenges: (1) Difficulty in generating style-controllable ta…

    Submitted 29 April, 2024; originally announced April 2024.

  39. arXiv:2404.18604  [pdf, other]

    cs.CV cs.AI

    CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation

    Authors: Xiangyu Liang, Wenlin Zhuang, Tianyong Wang, Guangxing Geng, Guangyue Geng, Haifeng Xia, Siyu Xia

    Abstract: Speech-driven 3D facial animation technology has been developed for years, but its practical application still lacks expectations. The main challenges lie in data limitations, lip alignment, and the naturalness of facial expressions. Although lip alignment has seen many related studies, existing methods struggle to synthesize natural and realistic expressions, resulting in a mechanical and stiff a…

    Submitted 29 April, 2024; originally announced April 2024.

  40. arXiv:2404.16310  [pdf, other]

    astro-ph.GA astro-ph.HE astro-ph.IM

    Measurement of Interstellar Magnetization by Synchrotron Polarization Variance

    Authors: Ning-Ning Guo, Jian-Fu Zhang, Hua-Ping Xiao, Jungyeon Cho, Xue-Juan Yang

    Abstract: Since synchrotron polarization fluctuations are related to the fundamental properties of the magnetic field, we propose the polarization intensity variance to measure the Galactic interstellar medium (ISM) magnetization. We confirm the method's applicability by comparing it with the polarization angle dispersion and its reliability by measuring the underlying Alfvénic Mach number of MHD turbulence…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures, and 2 tables. Accepted for publication in ApJ

  41. arXiv:2404.12254  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Omnidirectional 3D printing of PEDOT:PSS aerogels with tunable electromechanical performance for unconventional stretchable interconnects and thermoelectrics

    Authors: Hasan Emre Baysal, Tzu-Yi Yu, Viktor Naenen, Stijn De Smedt, Defne Hiz, Bokai Zhang, Heyi Xia, Isidro Florenciano, Martin Rosenthal, Ruth Cardinaels, Francisco Molina-Lopez

    Abstract: The next generation of soft electronics will expand to the third dimension. This will require the integration of mechanically-compliant three-dimensional functional structures with stretchable materials. This study demonstrates omnidirectional direct ink writing (DIW) of Poly(3,4-ethylenedioxythiophene) polystyrene sulfonate (PEDOT:PSS) aerogels with tunable electrical and mechanical performance,…

    Submitted 18 April, 2024; originally announced April 2024.

  42. arXiv:2404.10210  [pdf, other]

    cs.CV

    MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition

    Authors: Naichuan Zheng, Hailun Xia, Zeyu Liang

    Abstract: In recent years, skeleton-based action recognition, leveraging multimodal Graph Convolutional Networks (GCN), has achieved remarkable results. However, due to their deep structure and reliance on continuous floating-point operations, GCN-based methods are energy-intensive. To address this issue, we propose an innovative Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Disti…

    Submitted 15 April, 2024; originally announced April 2024.

  43. arXiv:2404.09833  [pdf, other]

    cs.CV cs.AI

    Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video

    Authors: Hongchi Xia, Zhi-Hao Lin, Wei-Chiu Ma, Shenlong Wang

    Abstract: Creating high-quality and interactive virtual environments, such as games and simulators, often involves complex and costly manual modeling processes. In this paper, we present Video2Game, a novel approach that automatically converts videos of real-world scenes into realistic and interactive game environments. At the heart of our system are three core components:(i) a neural radiance fields (NeRF)…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project page (with code): https://video2game.github.io/

  44. arXiv:2404.09115  [pdf, other]

    cs.CV

    GCC: Generative Calibration Clustering

    Authors: Haifeng Xia, Hai Huang, Zhengming Ding

    Abstract: Deep clustering as an important branch of unsupervised representation learning focuses on embedding semantically similar samples into the identical feature space. This core demand inspires the exploration of contrastive learning and subspace clustering. However, these solutions always rely on the basic assumption that there are sufficient and category-balanced samples for generating valid high-lev…

    Submitted 13 April, 2024; originally announced April 2024.

  45. arXiv:2404.04609  [pdf, other]

    astro-ph.HE astro-ph.GA

    The particle acceleration study in blazar jet

    Authors: Hubing Xiao, Wenxin Yang, Yutao Zhang, Shaohua Zhang, Junhui Fan, Liping Fu, Jianghe Yang

    Abstract: The particle acceleration of blazar jets is crucial to high-energy astrophysics, yet the acceleration mechanism division in blazar subclasses and the underlying nature of these mechanisms remain elusive. In this work, we utilized the synchrotron spectral information (synchrotron peak frequency, $\log ν_{\rm sy}$, and corresponding curvature, $b_{\rm sy}$) of 2705 blazars from the literature and st…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Accepted to ApJ

  46. arXiv:2404.04050  [pdf, other]

    cs.CV

    No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation

    Authors: Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Jiaming Liu, Han Xiao, Chaoyou Fu, Hao Dong, Peng Gao

    Abstract: To reduce the reliance on large-scale datasets, recent works in 3D segmentation resort to few-shot learning. Current 3D few-shot segmentation methods first pre-train models on 'seen' classes, and then evaluate their generalization performance on 'unseen' classes. However, the prior pre-training stage not only introduces excessive time overhead but also incurs a significant domain gap on 'unseen' c…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: CVPR Highlight. Code is available at https://github.com/yangyangyang127/Seg-NN. arXiv admin note: text overlap with arXiv:2308.12961

  47. Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation

    Authors: Hui Xiao, Yuting Hong, Li Dong, Diqun Yan, Jiayan Zhuang, Junjie Xiong, Dongtai Liang, Chengbin Peng

    Abstract: Semi-supervised semantic segmentation relieves the reliance on large-scale labeled data by leveraging unlabeled data. Recent semi-supervised semantic segmentation approaches mainly resort to pseudo-labeling methods to exploit unlabeled data. However, unreliable pseudo-labeling can undermine the semi-supervision processes. In this paper, we propose an algorithm called Multi-Level Label Correction (…

    Submitted 9 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 12 pages, 8 figures. IEEE Transactions on Multimedia, 2024

  48. arXiv:2404.01843  [pdf, other]

    cs.CV

    Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation

    Authors: Wangguandong Zheng, Haifeng Xia, Rui Chen, Ming Shao, Siyu Xia, Zhengming Ding

    Abstract: Recently, image-to-3D approaches have achieved significant results with a natural image as input. However, it is not always possible to access these enriched color input samples in practical applications, where only sketches are available. Existing sketch-to-3D researches suffer from limitations in broad applications due to the challenges of lacking color information and multi-view content. To ove…

    Submitted 7 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  49. arXiv:2403.19430  [pdf, other]

    cond-mat.mes-hall quant-ph

    Coexistence of non-Hermitian skin effect and extended states in one-dimensional nonreciprocal lattices

    Authors: Han Xiao, Qi-Bo Zeng

    Abstract: We study the one-dimensional non-Hermitian lattices with staggered onsite modulations and nonreciprocal hopping up to the next-nearest-neighboring (NNN) sites. Due to the NNN nonreciprocity, the non-Hermitian skin effect (NHSE) in the system under open boundary conditions (OBC) can be energy-dependent, and there will be NHSE edges in the eigenenergy spectrum, which separates the eigenstates locali…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 7 pages, 6 figures

  50. arXiv:2403.17507  [pdf, other]

    cs.LG physics.chem-ph

    EL-MLFFs: Ensemble Learning of Machine Leaning Force Fields

    Authors: Bangchen Yin, Yue Yin, Yuda W. Tang, Hai Xiao

    Abstract: Machine learning force fields (MLFFs) have emerged as a promising approach to bridge the accuracy of quantum mechanical methods and the efficiency of classical force fields. However, the abundance of MLFF models and the challenge of accurately predicting atomic forces pose significant obstacles in their practical application. In this paper, we propose a novel ensemble learning framework, EL-MLFFs,…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures