Skip to main content

Showing 1–50 of 9,782 results for author: Fang

.
  1. arXiv:2407.09274  [pdf, other

    cs.LG cs.AI q-bio.BM

    Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX

    Authors: Zhiyuan Chen, Tianhao Chen, Chenggang Xie, Yang Xue, Xiaonan Zhang, **gbo Zhou, Xiaomin Fang

    Abstract: Proteins are fundamental components of biological systems and can be represented through various modalities, including sequences, structures, and textual descriptions. Despite the advances in deep learning and scientific large language models (LLMs) for protein research, current methodologies predominantly focus on limited specialized tasks -- often predicting one protein modality from another. Th… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.08926  [pdf, other

    cs.IR

    Toward Automatic Group Membership Annotation for Group Fairness Evaluation

    Authors: Fumian Chen, Dayu Yang, Hui Fang

    Abstract: With the increasing research attention on fairness in information retrieval systems, more and more fairness-aware algorithms have been proposed to ensure fairness for a sustainable and healthy retrieval ecosystem. However, as the most adopted measurement of fairness-aware algorithms, group fairness evaluation metrics, require group membership information that needs massive human annotations and is… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Journal ref: NLDB2024

  3. arXiv:2407.08920  [pdf, other

    cond-mat.mes-hall cond-mat.str-el

    Chern Bands' Optimally Localized Wannier Functions and Fractional Chern Insulators

    Authors: Fang Xie, Yuan Fang, Lei Chen, Jennifer Cano, Qimiao Si

    Abstract: Recent development on fractional Chern insulators and proximate phases call for a real space representation of isolated Chern bands. Here we propose a new method for a general construction of optimally localized Wannier functions from such Chern bands. We do so through an optimal gauge choice of the Bloch states of a Chern band with the singularity placed at any desired position in momentum space.… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 7+29 pages, 3+14 figures

  4. arXiv:2407.08849  [pdf, other

    astro-ph.HE hep-ex

    TeV Analysis of a Source Rich Region with HAWC Observatory: Is HESS J1809-193 a Potential Hadronic PeVatron?

    Authors: A. Albert, R. Alfaro, C. Alvarez, J. C. Arteaga-Velázquez, D. Avila Rojas, R. Babu, E. Belmont-Moreno, A. Bernal, M. Breuhaus, K. S. Caballero-Mora, T. Capistrán, A. Carramiñana, S. Casanova, J. Cotzomi, E. De la Fuente, D. Depaoli, N. Di Lalla, R. Diaz Hernandez, B. L. Dingus, M. A. DuVernois, C. Espinoza, K. L. Fan, K. Fang, B. Fick, N. Fraija , et al. (57 additional authors not shown)

    Abstract: HESS J1809-193 is an unidentified TeV source, first detected by the High Energy Stereoscopic System (H.E.S.S.) Collaboration. The emission originates in a source-rich region that includes several Supernova Remnants (SNR) and Pulsars (PSR) including SNR G11.1+0.1, SNR G11.0-0.0, and the young radio pulsar J1809-1917. Originally classified as a pulsar wind nebula (PWN) candidate, recent studies show… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  5. arXiv:2407.08813  [pdf, other

    eess.IV cs.AI cs.CV

    FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification

    Authors: Yu Tian, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang

    Abstract: Addressing fairness in artificial intelligence (AI), particularly in medical AI, is crucial for ensuring equitable healthcare outcomes. Recent efforts to enhance fairness have introduced new methodologies and datasets in medical AI. However, the fairness issue under the setting of domain transfer is almost unexplored, while it is common that clinics rely on different imaging technologies (e.g., di… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ECCV 2024; Codes available at https://github.com/Harvard-Ophthalmology-AI-Lab/FairDomain

  6. arXiv:2407.08517  [pdf, other

    cs.CV

    Generalized Low-Rank Matrix Completion Model with Overlap** Group Error Representation

    Authors: Wen**g Lu, Zhuang Fang, Liang Wu, Liming Tang, Hanxin Liu

    Abstract: The low-rank matrix completion (LRMC) technology has achieved remarkable results in low-level visual tasks. There is an underlying assumption that the real-world matrix data is low-rank in LRMC. However, the real matrix data does not satisfy the strict low-rank property, which undoubtedly present serious challenges for the above-mentioned matrix recovery methods. Fortunately, there are feasible sc… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  7. arXiv:2407.08498  [pdf, other

    cs.CV eess.IV

    ERD: Exponential Retinex decomposition based on weak space and hybrid nonconvex regularization and its denoising application

    Authors: Wen**g Lu, Liang Wu, Liming Tang, Zhuang Fang

    Abstract: The Retinex theory models the image as a product of illumination and reflection components, which has received extensive attention and is widely used in image enhancement, segmentation and color restoration. However, it has been rarely used in additive noise removal due to the inclusion of both multiplication and addition operations in the Retinex noisy image modeling. In this paper, we propose an… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  8. arXiv:2407.08348  [pdf, other

    cs.AI cs.CL cs.LG

    Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

    Authors: Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou

    Abstract: In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs). We argue that the data scaling law for math reasoning capabilities in modern LLMs is far from being saturated, highlighting how the model's quality improves with increases in data quantity. To support this claim, we introduce the Skywork-Math model… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  9. arXiv:2407.08273   

    cs.CL

    RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL

    Authors: Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song

    Abstract: Large language models (LLMs) with in-context learning have significantly improved the performance of text-to-SQL task. Previous works generally focus on using exclusive SQL generation prompt to improve the LLMs' reasoning ability. However, they are mostly hard to handle large databases with numerous tables and columns, and usually ignore the significance of pre-processing database and extracting v… ▽ More

    Submitted 12 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: Further improvement and modification are needed.

  10. arXiv:2407.08255  [pdf, other

    cs.CV cs.LG

    GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification

    Authors: Aitao Yang, Min Li, Yao Ding, Leyuan Fang, Yaoming Cai, Yujie He

    Abstract: Efficient extraction of spectral sequences and geospatial information has always been a hot topic in hyperspectral image classification. In terms of spectral sequence feature capture, RNN and Transformer have become mainstream classification frameworks due to their long-range feature capture capabilities. In terms of spatial information aggregation, CNN enhances the receptive field to retain integ… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 10 figures

  11. arXiv:2407.08120  [pdf, other

    astro-ph.GA

    Spectroastrometry and Reverberation Map** (SARM) of Active Galactic Nuclei. I. The H$β$ Broad-line Region Structure and Black Hole Mass of Five Quasars

    Authors: Yan-Rong Li, Chen Hu, Zhu-Heng Yao, Yong-Jie Chen, Hua-Rui Bai, Sen Yang, Pu Du, Feng-Na Fang, Yi-Xin Fu, Jun-Rong Liu, Yue-Chang Peng, Yu-Yang Songsheng, Yi-Lin Wang, Ming Xiao, Shuo Zhai, Hartmut Winkler, **-Ming Bai, Luis C. Ho, Romain G. Petrov, Jesus Aceituno, Jian-Min Wang

    Abstract: We conduct a reverberation map** (RM) campaign to spectroscopically monitor a sample of selected bright active galactic nuclei with large anticipated broad-line region (BLR) sizes adequate for spectroastrometric observations by the GRAVITY instrument on the Very Large Telescope Interferometer. We report the first results for five objects, IC 4329A, Mrk 335, Mrk 509, Mrk 1239, and PDS 456, among… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 32 pages, 6 tables, 20 figures. To appear in ApJ

  12. On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments

    Authors: **gchao Fang, Nikos Arechiga, Keiichi Namaoshi, Nayeli Bravo, Candice Hogan, David A. Shamma

    Abstract: The Wizard of Oz (WoZ) method is a widely adopted research approach where a human Wizard ``role-plays'' a not readily available technology and interacts with participants to elicit user behaviors and probe the design space. With the growing ability for modern large language models (LLMs) to role-play, one can apply LLMs as Wizards in WoZ experiments with better scalability and lower cost than the… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: To be published in ACM IVA 2024

    ACM Class: H.5.m; I.2.7

    Journal ref: ACM International Conference on Intelligent Virtual Agents (IVA 2024), September 16-19, 2024, Glasgow, United Kingdom. ACM, New York, NY, USA

  13. arXiv:2407.07959  [pdf, other

    cs.SE cs.AI

    Source Code Summarization in the Era of Large Language Models

    Authors: Weisong Sun, Yun Miao, Yuekang Li, Hongyu Zhang, Chunrong Fang, Yi Liu, Gelei Deng, Yang Liu, Zhenyu Chen

    Abstract: To support software developers in understanding and maintaining programs, various automatic (source) code summarization techniques have been proposed to generate a concise natural language summary (i.e., comment) for a given code snippet. Recently, the emergence of large language models (LLMs) has led to a great boost in the performance of code-related tasks. In this paper, we undertake a systemat… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Just accepted to the 47th International Conference on Software Engineering (ICSE 2025)

    MSC Class: 68-04 ACM Class: D.2.3; I.2.7

  14. arXiv:2407.07930  [pdf

    q-bio.BM cs.LG

    Token-Mol 1.0: Tokenized drug design with large language model

    Authors: Jike Wang, Rui Qin, Mingyang Wang, Mei**g Fang, Yangyang Zhang, Yuchen Zhu, Qun Su, Qiaolin Gou, Chao Shen, Odin Zhang, Zhenxing Wu, Dejun Jiang, Xujun Zhang, Huifeng Zhao, Xiaozhe Wan, Zhourui Wu, Liwei Liu, Yu Kang, Chang-Yu Hsieh, Tingjun Hou

    Abstract: Significant interests have recently risen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D) structures, thereby limiting their effectiveness in tasks that explicitly involve molecular conformations. In this study, we introduced Token-Mol, a token-only 3D drug… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  15. arXiv:2407.07666  [pdf

    cs.CL cs.AI

    A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models : Safety, Consensus, Objectivity, Reproducibility and Explainability

    Authors: Ting Fang Tan, Kabilan Elangovan, Jasmine Ong, Nigam Shah, Joseph Sung, Tien Yin Wong, Lan Xue, Nan Liu, Haibo Wang, Chang Fu Kuo, Simon Chesterman, Zee Kin Yeong, Daniel SW Ting

    Abstract: A comprehensive qualitative evaluation framework for large language models (LLM) in healthcare that expands beyond traditional accuracy and quantitative metrics needed. We propose 5 key aspects for evaluation of LLMs: Safety, Consensus, Objectivity, Reproducibility and Explainability (S.C.O.R.E.). We suggest that S.C.O.R.E. may form the basis for an evaluation framework for future LLM-based models… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  16. arXiv:2407.07651  [pdf, other

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  17. arXiv:2407.07427  [pdf, other

    cs.CV

    Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation

    Authors: Hao Fang, Peng Wu, Yawei Li, Xinxin Zhang, Xiankai Lu

    Abstract: Open-Vocabulary Video Instance Segmentation (VIS) is attracting increasing attention due to its ability to segment and track arbitrary objects. However, the recent Open-Vocabulary VIS attempts obtained unsatisfactory results, especially in terms of generalization ability of novel categories. We discover that the domain gap between the VLM features (e.g., CLIP) and the instance queries and the unde… ▽ More

    Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  18. arXiv:2407.07221  [pdf, other

    cs.CV cs.CR

    Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning

    Authors: Yuqi Jia, Minghong Fang, Hongbin Liu, **ghuai Zhang, Neil Zhenqiang Gong

    Abstract: Poisoning attacks compromise the training phase of federated learning (FL) such that the learned global model misclassifies attacker-chosen inputs called target inputs. Existing defenses mainly focus on protecting the training phase of FL such that the learnt global model is poison free. However, these defenses often achieve limited effectiveness when the clients' local training data is highly non… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  19. arXiv:2407.06937  [pdf, other

    cs.CV

    HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

    Authors: Guian Fang, Wenbiao Yan, Yuanfan Guo, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang

    Abstract: Text-to-image diffusion models have significantly advanced in conditional image generation. However, these models usually struggle with accurately rendering images featuring humans, resulting in distorted limbs and other anomalies. This issue primarily stems from the insufficient recognition and evaluation of limb qualities in diffusion models. To address this issue, we introduce AbHuman, the firs… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  20. arXiv:2407.06842  [pdf, other

    cs.CV

    Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts

    Authors: Shuangkang Fang, Yufeng Wang, Yi-Hsuan Tsai, Yi Yang, Wenrui Ding, Shuchang Zhou, Ming-Hsuan Yang

    Abstract: Recent work on image content manipulation based on vision-language pre-training models has been effectively extended to text-driven 3D scene editing. However, existing schemes for 3D scene editing still exhibit certain shortcomings, hindering their further interactive design. Such schemes typically adhere to fixed input patterns, limiting users' flexibility in text input. Moreover, their editing c… ▽ More

    Submitted 9 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024; Project Website: https://sk-fun.fun/CE3D

  21. arXiv:2407.06459  [pdf, other

    cs.RO cs.AI cs.LG

    How Much Progress Did I Make? An Unexplored Human Feedback Signal for Teaching Robots

    Authors: Hang Yu, Qidi Fang, Shijie Fang, Reuben M. Aronson, Elaine Schaertl Short

    Abstract: Enhancing the expressiveness of human teaching is vital for both improving robots' learning from humans and the human-teaching-robot experience. In this work, we characterize and test a little-used teaching signal: \textit{progress}, designed to represent the completion percentage of a task. We conducted two online studies with 76 crowd-sourced participants and one public space study with 40 non-e… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 8 pages. RO-MAN 2024

  22. arXiv:2407.06317  [pdf, other

    cs.AI cs.CV cs.RO

    Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation

    Authors: Jianuo Huang, Zhenlong Fang

    Abstract: With the advancement of autonomous driving, ensuring safety during motion planning and navigation is becoming more and more important. However, most end-to-end planning methods suffer from a lack of safety. This research addresses the safety issue in the control optimization problem of autonomous driving, formulated as Constrained Markov Decision Processes (CMDPs). We propose a novel, model-based… ▽ More

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  23. arXiv:2407.06304  [pdf, other

    cs.CV cs.AI cs.CL

    VIMI: Grounding Video Generation through Multi-modal Instruction

    Authors: Yuwei Fang, Willi Menapace, Aliaksandr Siarohin, Tsai-Shien Chen, Kuan-Chien Wang, Ivan Skorokhodov, Graham Neubig, Sergey Tulyakov

    Abstract: Existing text-to-video diffusion models rely solely on text-only encoders for their pretraining. This limitation stems from the absence of large-scale multimodal prompt video datasets, resulting in a lack of visual grounding and restricting their versatility and application in multimodal integration. To address this, we construct a large-scale multimodal prompt dataset by employing retrieval metho… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  24. arXiv:2407.06045  [pdf, other

    cs.CV

    OpenCIL: Benchmarking Out-of-Distribution Detection in Class-Incremental Learning

    Authors: Wenjun Miao, Guansong Pang, Trong-Tung Nguyen, Ruohang Fang, ** Zheng, Xiao Bai

    Abstract: Class incremental learning (CIL) aims to learn a model that can not only incrementally accommodate new classes, but also maintain the learned knowledge of old classes. Out-of-distribution (OOD) detection in CIL is to retain this incremental learning ability, while being able to reject unknown samples that are drawn from different distributions of the learned classes. This capability is crucial to… ▽ More

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  25. arXiv:2407.06027  [pdf, other

    cs.CL

    PAS: Data-Efficient Plug-and-Play Prompt Augmentation System

    Authors: Miao Zheng, Hao Liang, Fan Yang, Haoze Sun, Tianpeng Li, Lingchu Xiong, Yan Zhang, Youzhen Wu, Kun Li, Yanjun Shen, Mingan Lin, Tao Zhang, Guosheng Dong, Yu**g Qiao, Kun Fang, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

    Abstract: In recent years, the rise of Large Language Models (LLMs) has spurred a growing demand for plug-and-play AI systems. Among the various AI techniques, prompt engineering stands out as particularly significant. However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficul… ▽ More

    Submitted 12 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  26. arXiv:2407.05928  [pdf, other

    eess.SP

    CA-FedRC: Codebook Adaptation via Federated Reservoir Computing in 5G NR

    Authors: Ziqiang Ye, Sikai Liao, Yulan Gao, Shu Fang, Yue Xiao, Ming Xiao, Saviour Zammit

    Abstract: With the burgeon deployment of the fifth-generation new radio (5G NR) networks, the codebook plays a crucial role in enabling the base station (BS) to acquire the channel state information (CSI). Different 5G NR codebooks incur varying overheads and exhibit performance disparities under diverse channel conditions, necessitating codebook adaptation based on channel conditions to reduce feedback ove… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  27. arXiv:2407.05869  [pdf, other

    cs.AI

    PORCA: Root Cause Analysis with Partially Observed Data

    Authors: Chang Gong, Di Yao, ** Wang, Wenbin Li, Lanting Fang, Yongtao Xie, Kaiyu Feng, Peng Han, **g** Bi

    Abstract: Root Cause Analysis (RCA) aims at identifying the underlying causes of system faults by uncovering and analyzing the causal structure from complex systems. It has been widely used in many application domains. Reliable diagnostic conclusions are of great importance in mitigating system failures and financial losses. However, previous studies implicitly assume a full observation of the system, which… ▽ More

    Submitted 11 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  28. arXiv:2407.05552  [pdf, other

    cs.CV

    Ada-adapter:Fast Few-shot Style Personlization of Diffusion Model with Pre-trained Image Encoder

    Authors: Jia Liu, Changlin Li, Qirui Sun, Jiahui Ming, Chen Fang, Jue Wang, Bing Zeng, Shuaicheng Liu

    Abstract: Fine-tuning advanced diffusion models for high-quality image stylization usually requires large training datasets and substantial computational resources, hindering their practical applicability. We propose Ada-Adapter, a novel framework for few-shot style personalization of diffusion models. Ada-Adapter leverages off-the-shelf diffusion models and pre-trained image feature encoders to learn a com… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 16 pages, 11 figures

    MSC Class: 68T07 ACM Class: I.4.0

  29. arXiv:2407.05395  [pdf, other

    nucl-th

    Quantifying angular distributions in multinucleon transfer reactions with a semi-classical method

    Authors: Zehong Liao, Zepeng Gao, Yu Yang, Yue** Fang, Jun Su, Long Zhu

    Abstract: The multinucleon transfer (MNT) process in low-energy heavy ion collisions can be utilized to produce unknown nuclei far beyond the stability line. However, the reaction products exhibit broad angular and energy distributions, which could lower the experimental detection efficiency. We present a classical approach that employs a parameterized angular distribution to describe the complex issue. By… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 6 pages, 6 figure

  30. arXiv:2407.05331  [pdf, ps, other

    eess.SY

    Channel Characterization of IRS-assisted Resonant Beam Communication Systems

    Authors: Wen Fang, Wen Chen, Qingqing Wu, Xusheng Zhu, Qiong Wu, Nan Cheng

    Abstract: To meet the growing demand for data traffic, spectrum-rich optical wireless communication (OWC) has emerged as a key technological driver for the development of 6G. The resonant beam communication (RBC) system, which employs spatially separated laser cavities as the transmitter and receiver, is a high-speed OWC technology capable of self-alignment without tracking. However, its transmission throug… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  31. arXiv:2407.05204  [pdf, other

    cond-mat.mtrl-sci

    Leveraging Persistent Homology Features for Accurate Defect Formation Energy Predictions via Graph Neural Networks

    Authors: Zhenyao Fang, Qimin Yan

    Abstract: Accurate predictions of defect formation energies are crucial to understanding defect properties in materials, such as favorable defect types and concentration, and to investigating the interplay between defects and other functional properties of materials. To overcome the computational expense of density-functional theory calculations in large supercells, several graph neural network (GNN) models… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 30 pages, 4 figures

  32. arXiv:2407.04923  [pdf, other

    cs.CV cs.CL

    OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

    Authors: Tiancheng Zhao, Qianqian Zhang, Kyusong Lee, Peng Liu, Lu Zhang, Chunxin Fang, Jiajia Liao, Kelei Jiang, Yibo Ma, Ruochen Xu

    Abstract: We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an ac… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 14 pages

  33. arXiv:2407.04616  [pdf, other

    cs.CV cs.AI cs.LG

    Isomorphic Pruning for Vision Models

    Authors: Gongfan Fang, Xinyin Ma, Michael Bi Mi, Xinchao Wang

    Abstract: Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures. However, assessing the relative importance of different sub-structures remains a significant challenge, particularly in advanced vision models featuring novel mechanisms and architectures like self-attention, depth-wise convolutions, or residual connections. These heterogeneous subst… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  34. arXiv:2407.04504  [pdf, other

    cs.CV

    Segment Any 4D Gaussians

    Authors: Shengxiang Ji, Guanjun Wu, Jiemin Fang, Jiazhong Cen, Taoran Yi, Wenyu Liu, Qi Tian, Xinggang Wang

    Abstract: Modeling, understanding, and reconstructing the real world are crucial in XR/VR. Recently, 3D Gaussian Splatting (3D-GS) methods have shown remarkable success in modeling and understanding 3D scenes. Similarly, various 4D representations have demonstrated the ability to capture the dynamics of the 4D world. However, there is a dearth of research focusing on segmentation within 4D representations.… ▽ More

    Submitted 12 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: 22 pages

  35. arXiv:2407.04451  [pdf, other

    cs.LG cs.AI

    Hindsight Preference Learning for Offline Preference-based Reinforcement Learning

    Authors: Chen-Xiao Gao, Shengjun Fang, Chenjun Xiao, Yang Yu, Zongzhang Zhang

    Abstract: Offline preference-based reinforcement learning (RL), which focuses on optimizing policies using human preferences between pairs of trajectory segments selected from an offline dataset, has emerged as a practical avenue for RL applications. Existing works rely on extracting step-wise reward signals from trajectory-wise preference annotations, assuming that preferences correlate with the cumulative… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  36. arXiv:2407.04418  [pdf, other

    cs.HC cs.AI cs.LG

    Enabling On-Device LLMs Personalization with Smartphone Sensing

    Authors: Shiquan Zhang, Ying Ma, Le Fang, Hong Jia, Simon D'Alfonso, Vassilis Kostakos

    Abstract: This demo presents a novel end-to-end framework that combines on-device large language models (LLMs) with smartphone sensing technologies to achieve context-aware and personalized services. The framework addresses critical limitations of current personalization solutions via cloud-based LLMs, such as privacy concerns, latency and cost, and limited personal sensor data. To achieve this, we innovati… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 5 pages, 3 figures, conference demo paper

  37. arXiv:2407.04285  [pdf, other

    cs.LG cs.AI

    Robust Decision Transformer: Tackling Data Corruption in Offline RL via Sequence Modeling

    Authors: Jiawei Xu, Rui Yang, Feng Luo, Meng Fang, Baoxiang Wang, Lei Han

    Abstract: Learning policies from offline datasets through offline reinforcement learning (RL) holds promise for scaling data-driven decision-making and avoiding unsafe and costly online interactions. However, real-world data collected from sensors or humans often contains noise and errors, posing a significant challenge for existing offline RL methods. Our study indicates that traditional offline RL methods… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  38. arXiv:2407.04208  [pdf, other

    cs.CV

    AMD: Automatic Multi-step Distillation of Large-scale Vision Models

    Authors: Cheng Han, Qifan Wang, Sohail A. Dianat, Majid Rabbani, Raghuveer M. Rao, Yi Fang, Qiang Guan, Lifu Huang, Dongfang Liu

    Abstract: Transformer-based architectures have become the de-facto standard models for diverse vision tasks owing to their superior performance. As the size of the models continues to scale up, model distillation becomes extremely important in various real applications, particularly on devices limited by computational resources. However, prevailing knowledge distillation methods exhibit diminished efficacy… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 19 pages, 5 figures

  39. arXiv:2407.03682  [pdf, other

    astro-ph.HE

    Observation of the Galactic Center PeVatron Beyond 100 TeV with HAWC

    Authors: A. Albert, R. Alfaro, C. Alvarez, A. Andrés, J. C. Arteaga-Velázquez, D. Avila Rojas, H. A. Ayala Solares, R. Babu, E. Belmont-Moreno, A. Bernal, K. S. Caballero-Mora, T. Capistrán, A. Carramiñana, S. Casanova, U. Cotti, J. Cotzomi, S. Coutiño de León, E. De la Fuente, C. de León, D. Depaoli, N. Di Lalla, N. Di Lalla, R. Diaz Hernandez, B. L. Dingus, M. A. DuVernois , et al. (78 additional authors not shown)

    Abstract: We report an observation of ultra-high energy (UHE) gamma rays from the Galactic Center region, using seven years of data collected by the High-Altitude Water Cherenkov (HAWC) Observatory. The HAWC data are best described as a point-like source (HAWC J1746-2856) with a power-law spectrum ($\mathrm{d}N/\mathrm{d}E=φ(E/26 \,\text{TeV})^γ$), where $γ=-2.88 \pm 0.15_{\text{stat}} - 0.1_{\text{sys}} $… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  40. arXiv:2407.03578  [pdf, other

    math.OC

    Distributed online generalized Nash Equilibrium learning in multi-cluster games: A delay-tolerant algorithm

    Authors: Bingqian Liu, Guanghui Wen, Xiao Fang, Tingwen Huang, Guanrong Chen

    Abstract: This paper addresses the problem of distributed online generalized Nash equilibrium (GNE) learning for multi-cluster games with delayed feedback information. Specifically, each agent in the game is assumed to be informed a sequence of local cost functions and constraint functions, which are known to the agent with time-varying delays subsequent to decision-making at each round. The objective of ea… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  41. arXiv:2407.03542  [pdf

    eess.IV cs.CV cs.LG

    Probing Perfection: The Relentless Art of Meddling for Pulmonary Airway Segmentation from HRCT via a Human-AI Collaboration Based Active Learning Method

    Authors: Shiyi Wang, Yang Nan, Sheng Zhang, Federico Felder, Xiaodan Xing, Yingying Fang, Javier Del Ser, Simon L F Walsh, Guang Yang

    Abstract: In pulmonary tracheal segmentation, the scarcity of annotated data is a prevalent issue in medical segmentation. Additionally, Deep Learning (DL) methods face challenges: the opacity of 'black box' models and the need for performance enhancement. Our Human-Computer Interaction (HCI) based models (RS_UNet, LC_UNet, UUNet, and WD_UNet) address these challenges by combining diverse query strategies w… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  42. arXiv:2407.03316  [pdf, other

    nucl-ex hep-ex

    An Upper Limit on the Photoproduction Cross Section of the Spin-Exotic $π_1(1600)$

    Authors: F. Afzal, C. S. Akondi, M. Albrecht, M. Amaryan, S. Arrigo, V. Arroyave, A. Asaturyan, A. Austregesilo, Z. Baldwin, F. Barbosa, J. Barlow, E. Barriga, R. Barsotti, D. Barton, V. Baturin, V. V. Berdnikov, T. Black, W. Boeglin, M. Boer, W. J. Briscoe, T. Britton, S. Cao, E. Chudakov, G. Chung, P. L. Cole , et al. (124 additional authors not shown)

    Abstract: The spin-exotic hybrid meson $π_{1}(1600)$ is predicted to have a large decay rate to the $ωππ$ final state. Using 76.6~pb$^{-1}$ of data collected with the GlueX detector, we measure the cross sections for the reactions $γp \to ωπ^+ π^- p$, $γp \to ωπ^0 π^0 p$, and $γp\toωπ^-π^0Δ^{++}$ in the range $E_γ=$ 8-10 GeV. Using isospin conservation, we set the first upper limits on the photoproduction c… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 6 pages, 3 figures plus supplemental materials

  43. arXiv:2407.03063  [pdf, other

    cs.HC

    ScreenTK: Seamless Detection of Time-Killing Moments Using Continuous Mobile Screen Text and on-device LLM

    Authors: Le Fang, Shiquan Zhang, Hong Jia, Jorge Goncalves, Vassilis Kostakos

    Abstract: Smartphones have become essential to people's digital lives, providing a continuous stream of information and connectivity. However, this constant flow can lead to moments where users are simply passing time rather than engaging meaningfully. This underscores the importance of develo** methods to identify these "time-killing" moments, enabling the delivery of important notifications in a way tha… ▽ More

    Submitted 7 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

  44. arXiv:2407.03026  [pdf, other

    cs.SD cs.AI eess.AS

    Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition

    Authors: **ming Chen, **gyi Fang, Yuanzhong Zheng, Yaoxuan Wang, Haojun Fei

    Abstract: Currently, end-to-end (E2E) speech recognition methods have achieved promising performance. However, auto speech recognition (ASR) models still face challenges in recognizing multi-accent speech accurately. We propose a layer-adapted fusion (LAF) model, called Qifusion-Net, which does not require any prior knowledge about the target accent. Based on dynamic chunk strategy, our approach enables str… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: accpeted by interspeech 2014, 5 pages, 1 figure

  45. arXiv:2407.02899  [pdf, other

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  46. arXiv:2407.02830  [pdf, other

    cs.CV eess.IV

    A Radiometric Correction based Optical Modeling Approach to Removing Reflection Noise in TLS Point Clouds of Urban Scenes

    Authors: Li Fang, Tianyu Li, Yanghong Lin, Shudong Zhou, Wei Yao

    Abstract: Point clouds are vital in computer vision tasks such as 3D reconstruction, autonomous driving, and robotics. However, TLS-acquired point clouds often contain virtual points from reflective surfaces, causing disruptions. This study presents a reflection noise elimination algorithm for TLS point clouds. Our innovative reflection plane detection algorithm, based on geometry-optical models and physica… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  47. arXiv:2407.02795  [pdf, ps, other

    nucl-th hep-ph

    Nuclear shell model study of neutrinoless double beta decay under Left-Right symmetric model

    Authors: Dong-Liang Fang, B. Alex Brown, Fedor Šimkovic

    Abstract: We use the large scale nuclear shell model to calculate the nuclear matrix elements for the neutrino mediated neutrinoless double beta decay within the Left-Right symmetric model for four nuclei: $^{76}$Ge, $^{82}$Se, $^{130}$Te and $^{136}$Xe. We perform a systematic analysis on the general magnitude of different terms for related mechanisms. For the $η$ mechanism, we find that the weak magnetism… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 10 pages, 2 figures

  48. arXiv:2407.02783  [pdf, ps, other

    cs.CL cs.AI

    52B to 1T: Lessons Learned via Tele-FLM Series

    Authors: Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

    Abstract: Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence. As scaling laws underscore the potential of increasing model sizes, the academic community has intensified its investigations into LLMs with capacities exceeding 50 billion parameters. This technical report builds on our prior work with Tele-FLM (also known as FLM-2), a publicly available 52-billion… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: For the Tele-FLM-52B tech report, see also 2404.16645

  49. arXiv:2407.02600  [pdf

    cond-mat.mtrl-sci cond-mat.mes-hall

    Macroscopic uniform 2D moiré superlattices with controllable angles

    Authors: Gregory Zaborski Jr., Paulina E. Majchrzak, Samuel Lai, Amalya C. Johnson, Ashley P. Saunders, Ziyan Zhu, Yujun Deng, Donghui Lu, Makoto Hashimoto, Z-X Shen, Fang Liu

    Abstract: Moiré superlattices, engineered through precise stacking of van der Waals (vdW) layers, hold immense promise for exploring strongly correlated and topological phenomena. However, these applications have been held back by the common preparation method: tear-and-stack of Scotch tape exfoliated monolayers. It has low efficiency and reproducibility, along with challenges of twist angle inhomogeneity,… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 16 pages, 4 figures

  50. arXiv:2407.02522  [pdf, other

    cs.SE

    Systematic literature review of the trust reinforcement mechanisms exist in package ecosystems

    Authors: Angel Temelko, Fang Hou, Siamak Farshidi, Slinger Jansen

    Abstract: We conducted a thorough SLR to better grasp the challenges and possible solutions associated with existing npm security tools. Our goal was to delve into documented experiences and findings. Specifically, we were keen to learn about the motivations behind choosing third-party packages, software engineers' responses to warning messages, and their overall understanding of security issues. The main a… ▽ More

    Submitted 26 June, 2024; originally announced July 2024.