Showing 51–100 of 8,503 results for author: Zhang, S

  1. arXiv:2406.15734  [pdf, other]

    cs.CL cs.AI

    RankAdaptor: Hierarchical Dynamic Low-Rank Adaptation for Structural Pruned LLMs

    Authors: Changhai Zhou, Shijie Han, Shiyang Zhang, Shichao Weng, Zekai Liu, Cheng **

    Abstract: The efficient compression of large language models (LLMs) is becoming increasingly popular. However, recovering the accuracy of compressed LLMs is still a major challenge. Structural pruning with standard Low-Rank Adaptation (LoRA) is a common technique in current LLM compression. In structural pruning, the model architecture is modified unevenly, resulting in suboptimal performance in various dow…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.15501  [pdf]

    cs.CR

    Secure Combination of Untrusted Time information Based on Optimized Dempster-Shafer Theory

    Authors: Yang Li, Yujie Luo, Yichen Zhang, Ao Sun, Wei Huang, Shuai Zhang, Tao Zhang, Chuang Zhou, Li Ma, Jie Yang, Mei Wu, Heng Wang, Yan Pan, Yun Shao, Xing Chen, Ziyang Chen, Song Yu, Hong Guo, Bingjie Xu

    Abstract: Secure precision time synchronization is important for applications of Cyber-Physical Systems. However, several attacks, especially the Time Delay Attack (TDA), seriously deteriorate the performance of time synchronization systems. The multiple-path scheme is regarded as an effective security countermeasure to decrease the influence of TDA. However, the effective secure combination algorithm is still…

    Submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.15471  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving Large Models with Small models: Lower Costs and Better Performance

    Authors: Dong Chen, Shuo Zhang, Yueting Zhuang, Siliang Tang, Qidong Liu, Hua Wang, Mingliang Xu

    Abstract: Pretrained large models (PLMs), such as ChatGPT, have demonstrated remarkable performance across diverse tasks. However, the significant computational requirements of PLMs have discouraged most product teams from running or fine-tuning them. In such cases, to harness the exceptional performance of PLMs, one must rely on expensive APIs, thereby exacerbating the economic burden. Despite the overall…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 11 pages

  4. arXiv:2406.15047  [pdf, other]

    cs.IT eess.SP

    Optimal Transmit Signal Design for Multi-Target MIMO Sensing Exploiting Prior Information

    Authors: Jiayi Yao, Shuowen Zhang

    Abstract: In this paper, we study the transmit signal optimization in a multiple-input multiple-output (MIMO) radar system for sensing the angle information of multiple targets via their reflected echo signals. We consider a challenging and practical scenario where the angles to be sensed are unknown and random, while their probability information is known a priori for exploitation. First, we establish an a…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: submitted for possible publication

  5. arXiv:2406.15030  [pdf, ps, other]

    hep-ex

    Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  6. arXiv:2406.15019  [pdf, other]

    cs.CL

    MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens

    Authors: Yongqi Fan, Hongli Sun, Kui Xue, Xiaofan Zhang, Shaoting Zhang, Tong Ruan

    Abstract: Numerous advanced Large Language Models (LLMs) now support context lengths up to 128K, and some extend to 200K. Some benchmarks in the generic domain have also followed up on evaluating long-context capabilities. In the medical domain, tasks are distinctive due to the unique contexts and need for domain expertise, necessitating further evaluation. However, despite the frequent presence of long tex…

    Submitted 21 June, 2024; originally announced June 2024.

  7. arXiv:2406.15000  [pdf, other]

    cs.CL cs.AI

    Unveiling the Impact of Multi-Modal Interactions on User Engagement: A Comprehensive Evaluation in AI-driven Conversations

    Authors: Lichao Zhang, Jia Yu, Shuai Zhang, Long Li, Yangyang Zhong, Guanbao Liang, Yuming Yan, Qing Ma, Fangsheng Weng, Fayu Pan, Jing Li, Renjun Xu, Zhenzhong Lan

    Abstract: Large Language Models (LLMs) have significantly advanced user-bot interactions, enabling more complex and coherent dialogues. However, the prevalent text-only modality might not fully exploit the potential for effective user engagement. This paper explores the impact of multi-modal interactions, which incorporate images and audio alongside text, on user engagement in chatbot conversations. We cond…

    Submitted 21 June, 2024; originally announced June 2024.

  8. arXiv:2406.14910  [pdf, ps, other]

    cs.LG cs.DC math.OC

    Towards Dynamic Resource Allocation and Client Scheduling in Hierarchical Federated Learning: A Two-Phase Deep Reinforcement Learning Approach

    Authors: Xiaojing Chen, Zhenyuan Li, Wei Ni, Xin Wang, Shunqing Zhang, Yanzan Sun, Shugong Xu, Qingqi Pei

    Abstract: Federated learning (FL) is a viable technique to train a shared machine learning model without sharing data. Hierarchical FL (HFL) systems have yet to be studied regarding their multiple levels of energy, computation, communication, and client scheduling, especially when it comes to clients relying on energy harvesting to power their operations. This paper presents a new two-phase deep deterministic p…

    Submitted 21 June, 2024; originally announced June 2024.

  9. arXiv:2406.14891  [pdf, other]

    cs.CL cs.IR

    Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering

    Authors: Zhengliang Shi, Shuo Zhang, Weiwei Sun, Shen Gao, Pengjie Ren, Zhumin Chen, Zhaochun Ren

    Abstract: Multi-Hop Question Answering (MHQA) tasks present a significant challenge for large language models (LLMs) due to the intensive knowledge required. Current solutions, like Retrieval-Augmented Generation, typically retrieve potential documents from an external corpus to read an answer. However, the performance of this retrieve-then-read paradigm is constrained by the retriever and the inevitable no…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: ACL 2024 (main conference)

  10. arXiv:2406.14887  [pdf, other]

    cs.CL

    InternLM-Law: An Open Source Chinese Legal Large Language Model

    Authors: Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge

    Abstract: While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field. In this paper, we introduce InternLM-Law, a specialized LLM tailored for addressing diverse legal queries related to Chinese laws, spanning from responding to standard legal questions (e.g., l…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Our dataset, code and models will be released at https://github.com/InternLM/InternLM-Law

  11. Towards Timely Video Analytics Services at the Network Edge

    Authors: Xishuo Li, Shan Zhang, Yuejiao Huang, Xiao Ma, Zhiyuan Wang, Hongbin Luo

    Abstract: Real-time video analytics services aim to provide users with accurate recognition results in a timely manner. However, existing studies usually fall into the dilemma between reducing delay and improving accuracy. The edge computing scenario imposes strict transmission and computation resource constraints, making balancing these conflicting metrics under dynamic network conditions difficult. In this regard, we…

    Submitted 20 June, 2024; originally announced June 2024.

  12. arXiv:2406.14544  [pdf, other]

    cs.CV cs.CL

    Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

    Authors: Yuxuan Qiao, Haodong Duan, Xinyu Fang, Junming Yang, Lin Chen, Songyang Zhang, Jiaqi Wang, Dahua Lin, Kai Chen

    Abstract: Vision Language Models (VLMs) demonstrate remarkable proficiency in addressing a wide array of visual questions, which requires strong perception and reasoning faculties. Assessing these two competencies independently is crucial for model refinement, despite the inherent difficulty due to the intertwined nature of seeing and reasoning in existing VLMs. To tackle this issue, we present Prism, an in…

    Submitted 20 June, 2024; originally announced June 2024.

  13. arXiv:2406.14042  [pdf, other]

    cond-mat.quant-gas

    Synthetic spin-orbit coupling for the multi-spin models in optical lattices

    Authors: Zhen Zheng, Yan-Qing Zhu, Shanchao Zhang, Shi-Liang Zhu, Z. D. Wang

    Abstract: The essential role of synthetic spin-orbit coupling in discovering new topological matter phases with cold atoms is widely acknowledged. However, the engineering of spin-orbit coupling remains unclear for arbitrary-spin models due to the complexity of spin matrices. In this work, we develop a more general but relatively straightforward method to achieve spin-orbit coupling for multi-spin models. O…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures

  14. arXiv:2406.13702  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Van-Hove annihilation and nematic instability on a Kagome lattice

    Authors: Yu-Xiao Jiang, Sen Shao, Wei Xia, M. Michael Denner, Julian Ingham, Md Shafayat Hossain, Qingzheng Qiu, Xiquan Zheng, Hongyu Chen, Zi-Jia Cheng, Xian P. Yang, Byunghoon Kim, Jia-Xin Yin, Songbo Zhang, Maksim Litskevich, Qi Zhang, Tyler A. Cochran, Yingying Peng, Guoqing Chang, Yanfeng Guo, Ronny Thomale, Titus Neupert, M. Zahid Hasan

    Abstract: Novel states of matter arise in quantum materials due to strong interactions among electrons. A nematic phase breaks the point group symmetry of the crystal lattice and is known to emerge in correlated materials. Here we report the observation of an intra-unit-cell nematic order and signatures of Pomeranchuk instability in the Kagome metal ScV6Sn6. Using scanning tunneling microscopy and spectrosc…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 19 pages, 5 figures, accepted for publication in Nature Materials

  15. arXiv:2406.13674  [pdf, other]

    eess.IV cs.CV

    Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases

    Authors: Xiangde Luo, Zihan Li, Shaoting Zhang, Wenjun Liao, Guotai Wang

    Abstract: Deep learning has enabled great strides in abdominal multi-organ segmentation, even surpassing junior oncologists on common cases or organs. However, robustness on corner cases and complex organs remains a challenging open problem for clinical adoption. To investigate model robustness, we collected and annotated the RAOS dataset comprising 413 CT scans ($\sim$80k 2D images, $\sim$8k 3D organ annot…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 10 pages, 1 figure, 6 tables, Early Accept to MICCAI 2024

  16. arXiv:2406.13511  [pdf, other]

    cs.DC

    Slice-Level Scheduling for High Throughput and Load Balanced LLM Serving

    Authors: Ke Cheng, Wen Hu, Zhi Wang, Hongen Peng, Jianguo Li, Sheng Zhang

    Abstract: Large language models (LLMs) iteratively generate text token by token, with memory usage increasing with the length of generated token sequences. The unpredictability of generation lengths makes it difficult to estimate the time and memory needed to process requests, posing a challenge for effective request scheduling. Conventional sequence-level scheduling (SLS) serves requests in a first-come fi…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 13 pages, 22 figures

  17. arXiv:2406.13268  [pdf, other]

    eess.AS cs.SD

    CEC: A Noisy Label Detection Method for Speaker Recognition

    Authors: Yao Shen, Yingying Gao, Yaqian Hao, Chenguang Hu, Fulin Zhang, Junlan Feng, Shilei Zhang

    Abstract: Noisy labels are inevitable, even in well-annotated datasets. The detection of noisy labels is of significant importance to enhance the robustness of speaker recognition models. In this paper, we propose a novel noisy label detection approach based on two new statistical metrics: Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC). These metrics are calculated through Cros…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  18. arXiv:2406.13171  [pdf]

    physics.optics

    Super-resolution 3D tomography of vector near-fields in dielectric resonators

    Authors: Bingbing Zhu, Qingnan Cai, Yaxin Liu, Sheng Zhang, Weifeng Liu, Qiong He, Lei Zhou, Zhensheng Tao

    Abstract: All-dielectric optical resonators, exhibiting exotic near-field distributions upon excitations, have emerged as low-loss, versatile and highly adaptable components in nanophotonic structures for manipulating electromagnetic waves and enhancing light-matter interactions. However, achieving experimental full three-dimensional characterization of near-fields within dielectric materials poses signific…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 26 pages, 4 figures

  19. arXiv:2406.12867  [pdf, ps, other]

    math.QA

    Classification of quasi-affine Generalized Dynkin Diagrams with Rank $3$ and Rank $2$

    Authors: Zhengtang Tan, Shouchuan Zhang

    Abstract: All quasi-affine connected Generalized Dynkin Diagrams with rank $3$ and $2$ are found. All quasi-affine Nichols (Lie braided) algebras with rank $3$ and $2$ are also found.

    Submitted 9 April, 2024; originally announced June 2024.

    Comments: 338 pages

    MSC Class: 16W30; 16G10

  20. arXiv:2406.12793  [pdf, other]

    cs.CL

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Authors: Team GLM, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, et al. (32 additional authors not shown)

    Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained…

    Submitted 18 June, 2024; originally announced June 2024.

  21. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  22. arXiv:2406.12443  [pdf, other]

    cs.RO

    Robustness Testing of Multi-Modal Models in Varied Home Environments for Assistive Robots

    Authors: Lea Hirlimann, Shengqiang Zhang, Hinrich Schütze, Philipp Wicke

    Abstract: The development of assistive robotic agents to support household tasks is advancing, yet the underlying models often operate in virtual settings that do not reflect real-world complexity. For assistive care robots to be effective in diverse environments, their models must be robust and integrate multiple modalities. Consider a caretaker needing assistance in a dimly lit room or navigating around a…

    Submitted 19 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Geriatronics Summit 2024, July 09 - 10, Garmisch-Partenkirchen Congress Center

  23. arXiv:2406.12199  [pdf, other]

    cs.LG cs.AI

    Time Series Modeling for Heart Rate Prediction: From ARIMA to Transformers

    Authors: Haowei Ni, Shuchen Meng, Xieming Geng, Panfeng Li, Zhuoying Li, Xupeng Chen, Xiaotong Wang, Shiyao Zhang

    Abstract: Cardiovascular disease (CVD) is a leading cause of death globally, necessitating precise forecasting models for monitoring vital signs like heart rate, blood pressure, and ECG. Traditional models, such as ARIMA and Prophet, are limited by their need for manual parameter tuning and challenges in handling noisy, sparse, and highly variable medical data. This study investigates advanced deep learning…

    Submitted 27 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by 2024 6th International Conference on Electronic Engineering and Informatics

  24. arXiv:2406.12111  [pdf, other]

    hep-ex

    Precision measurement of the $Ξ^-_b$ baryon lifetime

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1064 additional authors not shown)

    Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2014-010.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139

  25. arXiv:2406.12020  [pdf, other]

    cs.IR cs.AI

    When Box Meets Graph Neural Network in Tag-aware Recommendation

    Authors: Fake Lin, Ziwei Zhao, Xi Zhu, Da Zhang, Shitian Shen, Xueying Li, Tong Xu, Suojuan Zhang, Enhong Chen

    Abstract: Last year has witnessed the re-flourishment of tag-aware recommender systems supported by the LLM-enriched tags. Unfortunately, though large efforts have been made, current solutions may fail to describe the diversity and uncertainty inherent in user preferences with only tag-driven profiles. Recently, with the development of geometry-based techniques, e.g., box embedding, diversity of user prefer…

    Submitted 17 June, 2024; originally announced June 2024.

  26. arXiv:2406.12014  [pdf, other]

    astro-ph.HE

    An IXPE-Led X-ray Spectro-Polarimetric Campaign on the Soft State of Cygnus X-1: X-ray Polarimetric Evidence for Strong Gravitational Lensing

    Authors: James F. Steiner, Edward Nathan, Kun Hu, Henric Krawczynski, Michal Dovciak, Alexandra Veledina, Fabio Muleri, Jiri Svoboda, Kevin Alabarta, Maxime Parra, Yash Bhargava, Giorgio Matt, Juri Poutanen, Pierre-Olivier Petrucci, Allyn F. Tennant, M. Cristina Baglio, Luca Baldini, Samuel Barnier, Sudip Bhattacharyya, Stefano Bianchi, Maimouna Brigitte, Mauricio Cabezas, Floriane Cangemi, Fiamma Capitanio, Jacob Casey, et al. (112 additional authors not shown)

    Abstract: We present the first X-ray spectropolarimetric results for Cygnus X-1 in its soft state from a campaign of five IXPE observations conducted during 2023 May-June. Companion multiwavelength data during the campaign are likewise shown. The 2-8 keV X-rays exhibit a net polarization degree PD=1.99%+/-0.13% (68% confidence). The polarization signal is found to increase with energy across IXPE's 2-8 keV…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 20 pages, accepted for publication in ApJL

  27. arXiv:2406.11998  [pdf, other]

    math.AT

    Stability of Persistent Path Diagrams

    Authors: Shen Zhang

    Abstract: In real-world systems, the relationships and connections between components are highly complex. Real systems are often described as networks, where nodes represent objects in the system and edges represent relationships or connections between nodes. With the deepening of research, networks have been endowed with richer structures, such as directed edges, edge weights, and even hyperedges involving…

    Submitted 17 June, 2024; originally announced June 2024.

  28. arXiv:2406.11900  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Horizon-wise Learning Paradigm Promotes Gene Splicing Identification

    Authors: Qi-Jie Li, Qian Sun, Shao-Qun Zhang

    Abstract: Identifying gene splicing is a core and significant task confronted in modern collaboration between artificial intelligence and bioinformatics. Past decades have witnessed great efforts on this concern, such as the bio-plausible splicing pattern AT-CG and the famous SpliceAI. In this paper, we propose a novel framework for the task of gene splicing identification, named Horizon-wise Gene Splicing…

    Submitted 15 June, 2024; originally announced June 2024.

  29. arXiv:2406.11891  [pdf, other]

    cs.SI cs.AI cs.LG

    Towards Adaptive Neighborhood for Advancing Temporal Interaction Graph Modeling

    Authors: Siwei Zhang, Xi Chen, Yun Xiong, Xixi Wu, Yao Zhang, Yongrui Fu, Yinglong Zhao, Jiawei Zhang

    Abstract: Temporal Graph Networks (TGNs) have demonstrated their remarkable performance in modeling temporal interaction graphs. These works can generate temporal node representations by encoding the surrounding neighborhoods for the target node. However, an inherent limitation of existing TGNs is their reliance on fixed, hand-crafted rules for neighborhood encoding, overlooking the necessity for an adaptiv…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: KDD'2024 Research Track Paper

  30. arXiv:2406.11884  [pdf, other]

    cs.SI cs.AI

    Hierarchical Compression of Text-Rich Graphs via Large Language Models

    Authors: Shichang Zhang, Da Zheng, Jiani Zhang, Qi Zhu, Xiang Song, Soji Adeshina, Christos Faloutsos, George Karypis, Yizhou Sun

    Abstract: Text-rich graphs, prevalent in data mining contexts like e-commerce and academic graphs, consist of nodes with textual features linked by various relations. Traditional graph machine learning models, such as Graph Neural Networks (GNNs), excel in encoding the graph structural information, but have limited capability in handling rich text on graph nodes. Large Language Models (LLMs), noted for thei…

    Submitted 13 June, 2024; originally announced June 2024.

  31. arXiv:2406.11839  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    mDPO: Conditional Preference Optimization for Multimodal Large Language Models

    Authors: Fei Wang, Wenxuan Zhou, James Y. Huang, Nan Xu, Sheng Zhang, Hoifung Poon, Muhao Chen

    Abstract: Direct preference optimization (DPO) has been shown to be an effective method for large language model (LLM) alignment. Recent works have attempted to apply DPO to multimodal scenarios but have found it challenging to achieve consistent improvement. Through a comparative experiment, we identify the unconditional preference problem in multimodal preference optimization, where the model overlooks the ima…

    Submitted 17 June, 2024; originally announced June 2024.

  32. arXiv:2406.11827  [pdf, other]

    cs.CL cs.AI cs.LG

    WPO: Enhancing RLHF with Weighted Preference Optimization

    Authors: Wenxuan Zhou, Ravi Agrawal, Shujian Zhang, Sathish Reddy Indurthi, Sanqiang Zhao, Kaiqiang Song, Silei Xu, Chenguang Zhu

    Abstract: Reinforcement learning from human feedback (RLHF) is a promising solution to align large language models (LLMs) more closely with human values. Off-policy preference optimization, where the preference data is obtained from other models, is widely adopted due to its cost efficiency and scalability. However, off-policy preference optimization often suffers from a distributional gap between the polic…

    Submitted 17 June, 2024; originally announced June 2024.

  33. arXiv:2406.11788  [pdf, other]

    quant-ph cond-mat.dis-nn cond-mat.stat-mech

    Holographic Classical Shadow Tomography

    Authors: Shuhan Zhang, Xiaozhou Feng, Matteo Ippoliti, Yi-Zhuang You

    Abstract: We introduce "holographic shadows", a new class of randomized measurement schemes for classical shadow tomography that achieves the optimal scaling of sample complexity for learning geometrically local Pauli operators at any length scale, without the need for fine-tuning protocol parameters such as circuit depth or measurement rate. Our approach utilizes hierarchical quantum circuits, such as tree…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 20 pages, 9 figures

  34. arXiv:2406.11274  [pdf, other]

    cs.CL

    Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers

    Authors: Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang

    Abstract: The Transformer architecture has significantly advanced deep learning, particularly in natural language processing, by effectively managing long-range dependencies. However, as the demand for understanding complex relationships grows, refining the Transformer's architecture becomes critical. This paper introduces Skip-Layer Attention (SLA) to enhance Transformer models by enabling direct attention…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 7 pages, 1 figure

  35. Low-Energy Electronic Structure in the Unconventional Charge-Ordered State of ScV$_6$Sn$_6$

    Authors: Asish K. Kundu, Xiong Huang, Eric Seewald, Ethan Ritz, Santanu Pakhira, Shuai Zhang, Dihao Sun, Simon Turkel, Sara Shabani, Turgut Yilmaz, Elio Vescovo, Cory R. Dean, David C. Johnston, Tonica Valla, Turan Birol, Dmitri N. Basov, Rafael M. Fernandes, Abhay N. Pasupathy

    Abstract: Kagome vanadates {\it A}V$_3$Sb$_5$ display unusual low-temperature electronic properties including charge density waves (CDW), whose microscopic origin remains unsettled. Recently, CDW order has been discovered in a new material ScV$_6$Sn$_6$, providing an opportunity to explore whether the onset of CDW leads to unusual electronic properties. Here, we study this question using angle-resolved phot…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 33 pages, 4 figures

    Journal ref: Nat. Commun. 15, 5008 (2024)

  36. arXiv:2406.11211  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci cond-mat.supr-con

    Quantized Andreev conductance in semiconductor nanowires

    Authors: Yichun Gao, Wenyu Song, Yuhao Wang, Zuhan Geng, Zhan Cao, Zehao Yu, Shuai Yang, Jiaye Xu, Fangting Chen, Zonglin Li, Ruidong Li, Lining Yang, Zhaoyu Wang, Shan Zhang, Xiao Feng, Tiantian Wang, Yunyi Zang, Lin Li, Dong E. Liu, Runan Shang, Qi-Kun Xue, Ke He, Hao Zhang

    Abstract: Clean one-dimensional electron systems can exhibit quantized conductance. The plateau conductance doubles if the transport is dominated by Andreev reflection. Here, we report quantized conductance observed in both Andreev and normal-state transports in PbTe-Pb and PbTe-In hybrid nanowires. The Andreev plateau is observed at $4e^2/h$, twice of the normal plateau value of $2e^2/h$. In comparison, An…

    Submitted 17 June, 2024; originally announced June 2024.

  37. arXiv:2406.11169   

    eess.AS cs.SD

    Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision

    Authors: Yafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Qian Chen, Shiliang Zhang, Wen Wang

    Abstract: Training speaker-discriminative and robust speaker verification systems without explicit speaker labels remains a persisting challenge. In this paper, we propose a new self-supervised speaker verification approach, Self-Distillation Prototypes Network (SDPN), which effectively facilitates self-supervised speaker representation learning. SDPN assigns the representation of the augmented views of an…

    Submitted 25 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: We update this paper to an earlier paper

  38. arXiv:2406.10855  [pdf, other]

    cs.CV cs.AI

    ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model

    Authors: Song Zhang, Qingzhong Wang, Junyi Liu, Haoyi Xiong

    Abstract: In the fast-growing field of Remote Sensing (RS) image analysis, the gap between massive unlabeled datasets and the ability to fully utilize these datasets for advanced RS analytics presents a significant challenge. To fill the gap, our work introduces an innovative auto-labeling framework named ALPS (Automatic Labeling for Pre-training in Segmentation), leveraging the Segment Anything Model (SAM)…

    Submitted 16 June, 2024; originally announced June 2024.

  39. arXiv:2406.10744   

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Sheng** Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou , et al. (77 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…

    Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: The author list and contents need to be verified by all authors

  40. arXiv:2406.10729  [pdf, other]

    cs.LG cs.AI cs.CV

    A Comprehensive Survey of Foundation Models in Medicine

    Authors: Wasif Khan, Seowung Leem, Kyle B. See, Joshua K. Wong, Shaoting Zhang, Ruogu Fang

    Abstract: Foundation models (FMs) are large-scale deep-learning models trained on extensive datasets using self-supervised techniques. These models serve as a base for various downstream tasks, including healthcare. FMs have been adopted with great success across various domains within healthcare, including natural language processing (NLP), computer vision, graph learning, biology, and omics. Existing heal…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 44 pages, and a more compact version is under review

  41. arXiv:2406.10619  [pdf]

    physics.optics physics.data-an

    Transient Measurement of Near-field Thermal Radiation between Macroscopic Objects

    Authors: Sen Zhang, Yongdi Dang, Xinran Li, Yuxuan Li, Yi **, Pankaj K Choudhury, Jianbing Xu, Yungui Ma

    Abstract: The involvement of evanescent waves in the near-field regime could greatly enhance the spontaneous thermal radiation, offering a unique opportunity to study nanoscale photon-phonon interaction. However, accurately characterizing this subtle phenomenon is very challenging. This paper proposes a transient all-optical method for rapidly characterizing near-field radiative heat transfer (NFRHT) betwee…

    Submitted 15 June, 2024; originally announced June 2024.

  42. arXiv:2406.10591  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

    Authors: Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li

    Abstract: Foley audio, critical for enhancing the immersive experience in multimedia content, faces significant challenges in the AI-generated content (AIGC) landscape. Despite advancements in AIGC technologies for text and image generation, the foley audio dubbing remains rudimentary due to difficulties in cross-modal scene matching and content correlation. Current text-to-audio technology, which relies on…

    Submitted 15 June, 2024; originally announced June 2024.

  43. arXiv:2406.10555  [pdf, ps, other]

    math.OC

    Statistical Robustness of Kernel Learning Estimator with Respect to Data Perturbation

    Authors: Sainan Zhang, Huifu Xu, Hailin Sun

    Abstract: Inspired by the recent work [28] on the statistical robustness of empirical risks in reproducing kernel Hilbert space (RKHS) where the training data are potentially perturbed or even corrupted, we take a step further in this paper to investigate the statistical robustness of the kernel learning estimator (the regularized empirical risk minimizer or stationary point). We begin by deriving qualitati…

    Submitted 15 June, 2024; originally announced June 2024.

  44. arXiv:2406.10489  [pdf, ps, other]

    math.AP math.DG

    Classification of solutions to a biharmonic equation on the half-space and ball with pairs of conformal boundary operators

    Authors: Xuezhang Chen, Shihong Zhang

    Abstract: We introduce the notion of a biharmonic Poisson kernel associated with a certain pair of conformal boundary operators and present its explicit formula. With this powerful tool, we next establish the classification theorems of nonnegative solutions to a biharmonic equation on the upper half-space and unit ball with proper pairs of conformal boundary operators.

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 49 pages

  45. arXiv:2406.10478  [pdf, other]

    cs.CL cs.AI cs.GR

    From Words to Worlds: Transforming One-line Prompt into Immersive Multi-modal Digital Stories with Communicative LLM Agent

    Authors: Samuel S. Sohn, Danrui Li, Sen Zhang, Che-Jui Chang, Mubbasir Kapadia

    Abstract: Digital storytelling, essential in entertainment, education, and marketing, faces challenges in production scalability and flexibility. The StoryAgent framework, introduced in this paper, utilizes Large Language Models and generative tools to automate and refine digital storytelling. Employing a top-down story drafting and bottom-up asset generation approach, StoryAgent tackles key issues such as…

    Submitted 21 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 16 pages, 13 figures

  46. arXiv:2406.10277  [pdf]

    physics.class-ph physics.optics

    Tellegen responses in metamaterials

    Authors: Qingdong Yang, Xinhua Wen, Zhongfu Li, Oubo You, Shuang Zhang

    Abstract: Tellegen medium has long been a topic of debate, with its existence being contested over several decades. It was first proposed by Tellegen in 1948 and is characterized by a real-valued cross coupling between electric and magnetic responses, distinguishing it from the well-known chiral medium that has imaginary coupling coefficients. Significantly, Tellegen responses are closely linked to axion dy…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 19 pages, 4 figures

  47. arXiv:2406.10252  [pdf, other]

    cs.IR cs.AI cs.CL

    AutoSurvey: Large Language Models Can Automatically Write Surveys

    Authors: Yidong Wang, Qi Guo, Wen** Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang

    Abstract: This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence. Traditional survey paper creation faces challenges due to the vast volume and complexity of information, prompting the need for efficient survey methods. While large language models (LLMs) offer promise in…

    Submitted 17 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  48. arXiv:2406.10244  [pdf, other]

    cs.IR cs.AI

    GLINT-RU: Gated Lightweight Intelligent Recurrent Units for Sequential Recommender Systems

    Authors: Sheng Zhang, Maolin Wang, Xiangyu Zhao

    Abstract: In the rapidly evolving field of artificial intelligence, transformer-based models have gained significant attention in the context of Sequential Recommender Systems (SRSs), demonstrating remarkable proficiency in capturing user-item interactions. However, such attention-based frameworks result in substantial computational overhead and extended inference time. To address this problem, this paper p…

    Submitted 6 June, 2024; originally announced June 2024.

  49. arXiv:2406.10085  [pdf, other]

    cs.CL

    Enhancing Question Answering on Charts Through Effective Pre-training Tasks

    Authors: Ashim Gupta, Vivek Gupta, Shuo Zhang, Yujie He, Ning Zhang, Shalin Shah

    Abstract: To completely understand a document, the use of textual information is not enough. Understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches for document understanding (both OCR-based and OCR-free) work well, a thorough analysis of their capabilities and limitations has not yet been performed. Therefore, in this work, we address the li…

    Submitted 14 June, 2024; originally announced June 2024.

  50. arXiv:2406.09813  [pdf, other]

    astro-ph.IM astro-ph.HE

    Diffuse X-ray Explorer: a high-resolution X-ray spectroscopic sky surveyor on the China Space Station

    Authors: Hai **, Junjie Mao, Liubiao Chen, Naihui Chen, Wei Cui, Bo Gao, **** Li, Xinfeng Li, Jiejia Liu, Jia Quan, Chunyang Jiang, Guole Wang, Le Wang, Qian Wang, Sifan Wang, Aimin Xiao, Shuo Zhang

    Abstract: DIffuse X-ray Explorer (DIXE) is a proposed high-resolution X-ray spectroscopic sky surveyor on the China Space Station (CSS). DIXE will focus on studying hot baryons in the Milky Way. Galactic hot baryons like the X-ray emitting Milky Way halo and eROSITA bubbles are best observed in the sky survey mode with a large field of view. DIXE will take advantage of the orbital motion of the CSS to scan…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures, the full version is published by Journal of Low Temperature Physics