Showing 1–50 of 18,986 results for author: Zhang, Y

  1. arXiv:2407.09429  [pdf, other]

    cs.CL

    Open (Clinical) LLMs are Sensitive to Instruction Phrasings

    Authors: Alberto Mario Ceballos Arroyo, Monica Munnangi, Jiuding Sun, Karen Y. C. Zhang, Denis Jered McInerney, Byron C. Wallace, Silvio Amir

    Abstract: Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This issue is especially concerning in healthcare, as clinicians are unlikely to be experienced prompt engineers and the potential consequences of inaccurate outputs are heightened in this domain. This raises a…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: To appear at BioNLP, ACL 2024

  2. arXiv:2407.09357  [pdf, other]

    cs.LG q-bio.BM

    Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees

    Authors: Alexia Jolicoeur-Martineau, Aristide Baratin, Kisoo Kwon, Boris Knyazev, Yan Zhang

    Abstract: Generating novel molecules is challenging, with most representations leading to generative models producing many invalid molecules. Spanning Tree-based Graph Generation (STGG) is a promising approach to ensure the generation of valid molecules, outperforming state-of-the-art SMILES and graph diffusion models for unconditional generation. In the real world, we want to be able to generate molecules…

    Submitted 12 July, 2024; originally announced July 2024.

  3. arXiv:2407.09292  [pdf, other]

    cs.CR

    CEIPA: Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models

    Authors: Dong Shu, Mingyu Jin, Tianle Chen, Chong Zhang, Yongfeng Zhang

    Abstract: This study sheds light on the imperative need to bolster safety and privacy measures in large language models (LLMs), such as GPT-4 and LLaMA-2, by identifying and mitigating their vulnerabilities through explainable analysis of prompt attacks. We propose Counterfactual Explainable Incremental Prompt Attack (CEIPA), a novel technique where we guide prompts in a specific manner to quantitatively me…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 23 pages, 6 figures

  4. arXiv:2407.09139  [pdf, other]

    hep-ex

    Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II

    Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (414 additional authors not shown)

    Abstract: We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We det…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 10 pages, 4 figures

    Report number: Belle II Preprint 2024-009, KEK Preprint 2024-1

  5. arXiv:2407.09026  [pdf, other]

    cs.CV cs.LG cs.MM eess.IV

    HPC: Hierarchical Progressive Coding Framework for Volumetric Video

    Authors: Zihan Zheng, Houqiang Zhong, Qiang Hu, Xiaoyun Zhang, Li Song, Ya Zhang, Yanfeng Wang

    Abstract: Volumetric video based on Neural Radiance Field (NeRF) holds vast potential for various 3D applications, but its substantial data volume poses significant challenges for compression and transmission. Current NeRF compression lacks the flexibility to adjust video quality and bitrate within a single model for various network and device capacities. To address these issues, we propose HPC, a novel hie…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 11 pages, 7 figures

  6. arXiv:2407.09018  [pdf, other]

    cs.SE

    AUITestAgent: Automatic Requirements Oriented GUI Function Testing

    Authors: Yongxiang Hu, Xuan Wang, Yingchuan Wang, Yu Zhang, Shiyu Guo, Chaoyi Chen, Xin Wang, Yangfan Zhou

    Abstract: The Graphical User Interface (GUI) is how users interact with mobile apps. To ensure it functions properly, testing engineers have to make sure it functions as intended, based on test requirements that are typically written in natural language. While widely adopted manual testing and script-based methods are effective, they demand substantial effort due to the vast number of GUI pages and rapid it…

    Submitted 12 July, 2024; originally announced July 2024.

  7. arXiv:2407.08990  [pdf, other]

    cs.AR cs.AI cs.ET cs.NE

    Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

    Authors: Yue Zhang, Woyu Zhang, Shaocong Wang, Ning Lin, Yifei Yu, Yangu He, Bo Wang, Hao Jiang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

    Abstract: The brain is dynamic, associative and efficient. It reconfigures by associating the inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design, a semantic memory-based dynamic neural network…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: In press

  8. arXiv:2407.08984  [pdf, ps, other]

    hep-ex

    Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data

    Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (385 additional authors not shown)

    Abstract: We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle I…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 12 pages, 4 figures

    Report number: Belle II Preprint 2023-019; KEK Preprint 2023-37

  9. arXiv:2407.08972  [pdf, other]

    cs.CV

    Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness

    Authors: Honghao Chen, Yurong Zhang, Xiaokun Feng, Xiangxiang Chu, Kaiqi Huang

    Abstract: Robustness is a vital aspect to consider when deploying deep learning models into the wild. Numerous studies have been dedicated to the study of the robustness of vision transformers (ViTs), which have dominated as the mainstream backbone choice for vision tasks since the dawn of 2020s. Recently, some large kernel convnets make a comeback with impressive performance and efficiency. However, it sti…

    Submitted 11 July, 2024; originally announced July 2024.

  10. arXiv:2407.08966  [pdf, other]

    cs.CV cs.AI cs.LG

    LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models

    Authors: Yabin Zhang, Wenjie Zhu, Chenhang He, Lei Zhang

    Abstract: Out-of-distribution (OOD) detection is crucial for model reliability, as it identifies samples from unknown classes and reduces errors due to unexpected inputs. Vision-Language Models (VLMs) such as CLIP are emerging as powerful tools for OOD detection by integrating multi-modal information. However, the practical application of such systems is challenged by manual prompt engineering, which demand…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ECCV2024; Codes and Supp. are available at: https://github.com/YBZh/LAPT

  11. arXiv:2407.08965  [pdf, other]

    cs.CV cs.LG

    Lite-SAM Is Actually What You Need for Segment Everything

    Authors: Jianhai Fu, Yuanjie Yu, Ningchuan Li, Yi Zhang, Qichao Chen, Jun Yin, Zhiyu Xiang

    Abstract: This paper introduces Lite-SAM, an efficient end-to-end solution for the SegEvery task designed to reduce computational costs and redundancy. Lite-SAM is composed of four main components: a streamlined CNN-Transformer hybrid encoder (LiteViT), an automated prompt proposal network (AutoPPN), a traditional prompt encoder, and a mask decoder. All these components are integrated within the SAM framewo…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ECCV 2024 Accepted

  12. arXiv:2407.08952  [pdf, other]

    cs.CL cs.AI

    Detect, Investigate, Judge and Determine: A Novel LLM-based Framework for Few-shot Fake News Detection

    Authors: Ye Liu, Jiajun Zhu, Kai Zhang, Haoyu Tang, Yanghai Zhang, Xukai Liu, Qi Liu, Enhong Chen

    Abstract: Few-Shot Fake News Detection (FS-FND) aims to distinguish inaccurate news from real ones in extremely low-resource scenarios. This task has garnered increased attention due to the widespread dissemination and harmful impact of fake news on social media. Large Language Models (LLMs) have demonstrated competitive performance with the help of their rich prior knowledge and excellent in-context learni…

    Submitted 11 July, 2024; originally announced July 2024.

  13. arXiv:2407.08739  [pdf, other]

    cs.CV

    MAVIS: Mathematical Visual Instruction Tuning

    Authors: Renrui Zhang, Xinyu Wei, Dongzhi Jiang, Yichi Zhang, Ziyu Guo, Chengzhuo Tong, Jiaming Liu, Aojun Zhou, Bin Wei, Shanghang Zhang, Peng Gao, Hongsheng Li

    Abstract: Multi-modal Large Language Models (MLLMs) have recently emerged as a significant focus in academia and industry. Despite their proficiency in general multi-modal scenarios, the mathematical problem-solving capabilities in visual contexts remain insufficiently explored. We identify three key areas within MLLMs that need to be improved: visual encoding of math diagrams, diagram-language alignment, a…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Work in progress. Data and Models are released at https://github.com/ZrrSkywalker/MAVIS

  14. arXiv:2407.08672  [pdf, other]

    cs.CV

    NODE-Adapter: Neural Ordinary Differential Equations for Better Vision-Language Reasoning

    Authors: Yi Zhang, Chun-Wun Cheng, Ke Yu, Zhihai He, Carola-Bibiane Schönlieb, Angelica I. Aviles-Rivero

    Abstract: In this paper, we consider the problem of prototype-based vision-language reasoning problem. We observe that existing methods encounter three major challenges: 1) escalating resource demands and prolonging training times, 2) contending with excessive learnable parameters, and 3) fine-tuning based only on a single modality. These challenges will hinder their capability to adapt Vision-Language Mode…

    Submitted 11 July, 2024; originally announced July 2024.

  15. arXiv:2407.08641  [pdf, other]

    cs.LG cs.NE math.DS nlin.AO

    How more data can hurt: Instability and regularization in next-generation reservoir computing

    Authors: Yuanzhao Zhang, Sean P. Cornelius

    Abstract: It has been found recently that more data can, counter-intuitively, hurt the performance of deep neural networks. Here, we show that a more extreme version of the phenomenon occurs in data-driven models of dynamical systems. To elucidate the underlying mechanism, we focus on next-generation reservoir computing (NGRC) -- a popular framework for learning dynamics from data. We find that, despite lea…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 10 pages, 10 figures. Comments welcome

  16. arXiv:2407.08608  [pdf, other]

    cs.LG cs.AI

    FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

    Authors: Jay Shah, Ganesh Bikshandi, Ying Zhang, Vijay Thakkar, Pradeep Ramani, Tri Dao

    Abstract: Attention, as a core layer of the ubiquitous Transformer architecture, is the bottleneck for large language models and long-context applications. FlashAttention elaborated an approach to speed up attention on GPUs through minimizing memory reads/writes. However, it has yet to take advantage of new capabilities present in recent hardware, with FlashAttention-2 achieving only 35% utilization on the…

    Submitted 11 July, 2024; originally announced July 2024.

  17. arXiv:2407.08560  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Causal inference through multi-stage learning and doubly robust deep neural networks

    Authors: Yuqian Zhang, Jelena Bradic

    Abstract: Deep neural networks (DNNs) have demonstrated remarkable empirical performance in large-scale supervised learning problems, particularly in scenarios where both the sample size $n$ and the dimension of covariates $p$ are large. This study delves into the application of DNNs across a wide spectrum of intricate causal inference tasks, where direct estimation falls short and necessitates multi-stage…

    Submitted 11 July, 2024; originally announced July 2024.

  18. arXiv:2407.08559  [pdf]

    physics.app-ph

    Study of a Novel Capacitive Pressure Sensor Using Spiral Comb Electrodes

    Authors: Wenjie Chen, Qi Yang, Qi Liu, Yiqun Zhang, Liang He, Yuanlin Xia, Zhuqing Wang, Yubo Huang, Jianfeng Chen, Cao Xia

    Abstract: For traditional capacitive pressure sensors, high nonlinearity and poor sensitivity greatly limited their sensing applications. Hence, an innovative design of capacitors based on spiral comb electrodes is proposed for high-sensitivity pressure detection in this work. Compared to traditional capacitive pressure sensors with straight plate electrodes, the proposed sensor with the spiral electrodes i…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 20 pages, 14 figures

    MSC Class: -

  19. arXiv:2407.08546  [pdf, other]

    cs.CV cs.LG q-bio.QM

    Quantitative Evaluation of the Saliency Map for Alzheimer's Disease Classifier with Anatomical Segmentation

    Authors: Yihan Zhang, Xuanshuo Zhang, Wei Wu, Haohan Wang

    Abstract: Saliency maps have been widely used to interpret deep learning classifiers for Alzheimer's disease (AD). However, since AD is heterogeneous and has multiple subtypes, the pathological mechanism of AD remains not fully understood and may vary from patient to patient. Due to the lack of such understanding, it is difficult to comprehensively and effectively assess the saliency map of AD classifier. I…

    Submitted 11 July, 2024; originally announced July 2024.

  20. arXiv:2407.08532  [pdf, other]

    cs.CR cs.SE

    Tactics, Techniques, and Procedures (TTPs) in Interpreted Malware: A Zero-Shot Generation with Large Language Models

    Authors: Ying Zhang, Xiaoyan Zhou, Hui Wen, Wenjia Niu, Jiqiang Liu, Haining Wang, Qiang Li

    Abstract: Nowadays, the open-source software (OSS) ecosystem suffers from security threats of software supply chain (SSC) attacks. Interpreted OSS malware plays a vital role in SSC attacks, as criminals have an arsenal of attack vectors to deceive users into installing malware and executing malicious activities. In this paper, we introduce tactics, techniques, and procedures (TTPs) proposed by MITRE ATT&CK…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 19 pages, 11 figures

  21. arXiv:2407.08267  [pdf, other]

    physics.optics

    Chiral bulk solitons in photonic graphene with decorated boundaries

    Authors: Shuang Shen, Ce Shang, Yongdong Li, Yiqi Zhang

    Abstract: We propose a chiral bulk soliton in a nonlinear photonic lattice with decorated boundaries, presenting a novel approach to manipulate photonic transport without extensive bulk modifications. Unlike traditional methods that rely on topological edge and corner modes, our strategy leverages the robust chiral propagation of bulk modes. By introducing nonlinearity into the system, we find a stable bulk…

    Submitted 11 July, 2024; originally announced July 2024.

  22. arXiv:2407.08223  [pdf, other]

    cs.CL cs.AI

    Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

    Authors: Zilong Wang, Zifeng Wang, Long Le, Huaixiu Steven Zheng, Swaroop Mishra, Vincent Perot, Yuwei Zhang, Anush Mattapalli, Ankur Taly, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

    Abstract: Retrieval augmented generation (RAG) combines the generative abilities of large language models (LLMs) with external knowledge sources to provide more accurate and up-to-date responses. Recent RAG advancements focus on improving retrieval outcomes through iterative LLM refinement or self-critique capabilities acquired through additional instruction tuning of LLMs. In this work, we introduce Specul…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Preprint

  23. arXiv:2407.08202  [pdf]

    physics.geo-ph

    A rotational ellipsoid model for solid Earth tide with high precision

    Authors: Yongfeng Yang, Yunfei Zhang, Qiang Liu, Xianqing Lv, Pu Huang

    Abstract: Solid Earth tide represents the elastic response of solid Earth to the lunar (solar) gravitational force. The yielding solid Earth due to the force has been thought to be a prolate ellipsoid since the time of Lord Kelvin, yet the ellipsoid's geometry such as semi-major axis's length, semi-minor axis's length, and oblateness remains unresolved. Additionally, the tidal displacement of solid Earth is…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 20 pages, 4 figures, 1 table

  24. arXiv:2407.08199  [pdf, other]

    cs.CV

    SRPose: Two-view Relative Pose Estimation with Sparse Keypoints

    Authors: Rui Yin, Yulun Zhang, Zherong Pan, Jianjun Zhu, Cheng Wang, Biao Jia

    Abstract: Two-view pose estimation is essential for map-free visual relocalization and object pose tracking tasks. However, traditional matching methods suffer from time-consuming robust estimators, while deep learning-based pose regressors only cater to camera-to-world pose estimation, lacking generalizability to different image sizes and camera intrinsics. In this paper, we propose SRPose, a sparse keypoi…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 30 pages, 11 figures, to be published in ECCV 2024

  25. arXiv:2407.08187  [pdf, other]

    cs.CV

    ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

    Authors: Ruijie Zhu, Chuxin Wang, Ziyang Song, Li Liu, Tianzhu Zhang, Yongdong Zhang

    Abstract: Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, facing challenges in generalizing acros…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures, 13 tables

  26. arXiv:2407.08167  [pdf, other]

    eess.IV cs.CV

    DSCENet: Dynamic Screening and Clinical-Enhanced Multimodal Fusion for MPNs Subtype Classification

    Authors: Yuan Zhang, Yaolei Qi, Xiaoming Qi, Yongyue Wei, Guanyu Yang

    Abstract: The precise subtype classification of myeloproliferative neoplasms (MPNs) based on multimodal information, which assists clinicians in diagnosis and long-term treatment plans, is of great clinical significance. However, it remains a great challenging task due to the lack of diagnostic representativeness for local patches and the absence of diagnostic-relevant features from a single modality. In th…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI2024

  27. arXiv:2407.08043  [pdf, other]

    quant-ph

    Spin/Phonon Dynamics in Single Molecular Magnets: I. quantum embedding

    Authors: Nosheen Younas, Yu Zhang, Andrei Piryatinski, Eric R Bittner

    Abstract: Single molecular magnets (SMMs) and Metal-Organic Frameworks (MOFs) attract significant interest due to their potential in quantum information processing, scalable quantum computing, and extended lifetimes and coherence times. The limiting factor in these systems is often the spin dephasing caused by interactions and couplings with the vibrational motions of the molecular framework. This work intr…

    Submitted 10 July, 2024; originally announced July 2024.

  28. arXiv:2407.08039  [pdf, other]

    cs.CL

    Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models

    Authors: Yuji Zhang, Sha Li, Jiateng Liu, Pengfei Yu, Yi R. Fung, Jing Li, Manling Li, Heng Ji

    Abstract: Hallucination is often regarded as a major impediment for using large language models (LLMs), especially for knowledge-intensive tasks. Even when the training corpus consists solely of true statements, language models still generate hallucinations in the form of amalgamations of multiple facts. We coin this phenomenon as "knowledge overshadowing": when we query knowledge from a language model wi…

    Submitted 10 July, 2024; originally announced July 2024.

  29. arXiv:2407.08021  [pdf, other]

    cs.MA

    Field Deployment of Multi-Agent Reinforcement Learning Based Variable Speed Limit Controllers

    Authors: Yuhang Zhang, Zhiyao Zhang, Marcos Quiñones-Grueiro, William Barbour, Clay Weston, Gautam Biswas, Daniel Work

    Abstract: This article presents the first field deployment of a multi-agent reinforcement-learning (MARL) based variable speed limit (VSL) control system on the I-24 freeway near Nashville, Tennessee. We describe how we train MARL agents in a traffic simulator and directly deploy the simulation-based policy on a 17-mile stretch of Interstate 24 with 67 VSL controllers. We use invalid action masking and seve…

    Submitted 10 July, 2024; originally announced July 2024.

  30. arXiv:2407.07930  [pdf]

    q-bio.BM cs.LG

    Token-Mol 1.0: Tokenized drug design with large language model

    Authors: Jike Wang, Rui Qin, Mingyang Wang, Meijing Fang, Yangyang Zhang, Yuchen Zhu, Qun Su, Qiaolin Gou, Chao Shen, Odin Zhang, Zhenxing Wu, Dejun Jiang, Xujun Zhang, Huifeng Zhao, Xiaozhe Wan, Zhourui Wu, Liwei Liu, Yu Kang, Chang-Yu Hsieh, Tingjun Hou

    Abstract: Significant interests have recently risen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D) structures, thereby limiting their effectiveness in tasks that explicitly involve molecular conformations. In this study, we introduced Token-Mol, a token-only 3D drug…

    Submitted 10 July, 2024; originally announced July 2024.

  31. arXiv:2407.07895  [pdf, other]

    cs.CV cs.CL cs.LG

    LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

    Authors: Feng Li, Renrui Zhang, Hao Zhang, Yuanhan Zhang, Bo Li, Wei Li, Zejun Ma, Chunyuan Li

    Abstract: Visual instruction tuning has made considerable strides in enhancing the capabilities of Large Multimodal Models (LMMs). However, existing open LMMs largely focus on single-image tasks, their applications to multi-image scenarios remains less explored. Additionally, prior LMM research separately tackles different scenarios, leaving it impossible to generalize cross scenarios with new emerging capa…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Project Page: https://llava-vl.github.io/blog/2024-06-16-llava-next-interleave/

  32. arXiv:2407.07843  [pdf, other]

    quant-ph

    Spin/Phonon Dynamics in Single Molecular Magnets: II. spin/phonon entanglement

    Authors: Nosheen Younas, Yu Zhang, Andrei Piryatinski, Eric R Bittner

    Abstract: We introduce a new quantum embedding method to explore spin-phonon interactions in molecular magnets. This technique consolidates various spin/phonon couplings into a limited number of collective degrees of freedom, allowing for a fully quantum mechanical treatment. By precisely factorizing the entire system into "system" and "bath" sub-ensembles, our approach simplifies a previously intractable p…

    Submitted 10 July, 2024; originally announced July 2024.

  33. arXiv:2407.07678  [pdf, other]

    astro-ph.EP astro-ph.SR

    The ESO SupJup Survey II: The $^{12}$C/$^{13}$C ratios of three young brown dwarfs with CRIRES$^+$

    Authors: D. González Picos, I. A. G. Snellen, S. de Regt, R. Landman, Y. Zhang, S. Gandhi, C. Ginski, A. Y. Kesseli, P. Mollière, T. Stolker

    Abstract: Young brown dwarfs exhibit atmospheric characteristics similar to those of super-Jupiters, providing a unique opportunity to study planetary atmospheres. The ESO SupJup Survey, utilizing CRIRES$^+$ on the Very Large Telescope, aims to assess the role of $^{12}$C/$^{13}$C as a formation tracer. We present observations of three young brown dwarfs: 2MASS J12003792-7845082, TWA 28, and 2MASS J08561384…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in A&A

  34. arXiv:2407.07660  [pdf, ps, other]

    cs.CV cs.AI

    Boosting Medical Image Synthesis via Registration-guided Consistency and Disentanglement Learning

    Authors: Chuanpu Li, Zeli Chen, Yiwen Zhang, Liming Zhong, Wei Yang

    Abstract: Medical image synthesis remains challenging due to misalignment noise during training. Existing methods have attempted to address this challenge by incorporating a registration-guided module. However, these methods tend to overlook the task-specific constraints on the synthetic and registration modules, which may cause the synthetic module to still generate spatially aligned images with misaligned…

    Submitted 10 July, 2024; originally announced July 2024.

  35. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946 GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  36. arXiv:2407.07523  [pdf, other]

    cs.CV cs.MM

    SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning

    Authors: Haiwen Diao, Bo Wan, Xu Jia, Yunzhi Zhuge, Ying Zhang, Huchuan Lu, Long Chen

    Abstract: Parameter-efficient transfer learning (PETL) has emerged as a flourishing research field for adapting large pre-trained models to downstream tasks, greatly reducing trainable parameters while grappling with memory challenges during fine-tuning. To address it, memory-efficient series (METL) avoid backpropagating gradients through the large backbone. However, they compromise by exclusively relying o…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 23 pages, 11 figures, Accepted by ECCV2024

  37. arXiv:2407.07332  [pdf, ps, other]

    cs.IT

    Several new classes of optimal ternary cyclic codes with two or three zeros

    Authors: Gaofei Wu, Zhuohui You, Zhengbang Zha, Yuqing Zhang

    Abstract: Cyclic codes are a subclass of linear codes and have wide applications in data storage systems, communication systems and consumer electronics due to their efficient encoding and decoding algorithms. Let $α$ be a generator of $\mathbb{F}_{3^m}^*$, where $m$ is a positive integer. Denote by $\mathcal{C}_{(i_1,i_2,\cdots, i_t)}$ the cyclic code with generator polynomial…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 16 pages

  38. arXiv:2407.07302  [pdf, other]

    eess.IV cs.CV

    Pairwise Distance Distillation for Unsupervised Real-World Image Super-Resolution

    Authors: Yuehan Zhang, Seungjun Lee, Angela Yao

    Abstract: Standard single-image super-resolution creates paired training data from high-resolution images through fixed downsampling kernels. However, real-world super-resolution (RWSR) faces unknown degradations in the low-resolution inputs, all the while lacking paired training data. Existing methods approach this problem by learning blind general models through complex synthetic augmentations on training…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  39. arXiv:2407.07233  [pdf, other]

    astro-ph.SR astro-ph.GA

    A kinematical study of the launching region of the blueshifted HH 46/47 outflow with SINFONI K-band observations

    Authors: M. Birney, C. Dougados, E. T. Whelan, B. Nisini, S. Cabrit, Y. Zhang

    Abstract: Studying outflows is important as they may significantly contribute to angular momentum removal from the star/disk system, affecting disk evolution and planet formation. To investigate the different outflow components; the collimated jet, wide-angled molecular outflow, and outflow cavity, of the Class I HH 46/47 outflow system. We focus on their kinematics. We present Near Infrared (NIR) K-band in…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 20 pages, 12 figures, 2 tables, submitted to A&A

  40. arXiv:2407.07099  [pdf, other]

    cs.CL cs.AI cs.GT cs.LG

    Nash CoT: Multi-Path Inference with Preference Equilibrium

    Authors: Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang

    Abstract: Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs) on complex problems. Among CoT-related studies, self-consistency (Multi-path inference with answer filtering through voting) involves generating multiple reasoning paths using the CoT framework and then selecting the most frequently produced outputs standing…

    Submitted 18 June, 2024; originally announced July 2024.

  41. arXiv:2407.07035  [pdf, other]

    cs.CL cs.CV

    Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

    Authors: Yue Zhang, Ziqiao Ma, Jialu Li, Yanyuan Qiao, Zun Wang, Joyce Chai, Qi Wu, Mohit Bansal, Parisa Kordjamshidi

    Abstract: Vision-and-Language Navigation (VLN) has gained increasing attention over recent years and many approaches have emerged to advance their development. The remarkable achievements of foundation models have shaped the challenges and proposed methods for VLN research. In this survey, we provide a top-down review that adopts a principled framework for embodied planning and reasoning, and emphasizes the…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Authors contributed equally to this work, and supervisors contributed equal advising to this work

  42. arXiv:2407.07017  [pdf, other]

    gr-qc hep-th

    Shadows, greybody factors, emission rate, topological charge, and phase transitions for a charged black hole with a Kalb-Ramond field background

    Authors: F. Hosseinifar, A. A. Araújo Filho, M. Y. Zhang, H. Chen, H. Hassanabadi

    Abstract: In this work, we investigate a spherically symmetric charged black hole in the presence of a Kalb-Ramond field background. We calculate the photon sphere and shadow radii and, corroborating our results, we constrain them from observational data from the Event Horizon Telescope (EHT), particularly focusing on the shadow images of Sagittarius $A^{*}$. Additionally, we analyze the greybody factors,…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 7 pages in two column, 10 figures and 2 tables

  43. arXiv:2407.06975  [pdf]

    cond-mat.mtrl-sci

    Optimization of noncollinear magnetic ordering temperature in Y-type hexaferrite by machine learning

    Authors: Yonghong Li, Jing Zhang, Linfeng Jiang, Long Zhang, Yugang Zhang, Xueliang Wu, Yisheng Chai, Xiaoyuan Zhou, Zizhen Zhou

    Abstract: Searching for the optimal doping compositions of the Y-type hexaferrite Ba2Mg2Fe12O22 remains a long-standing challenge for enhanced non-collinear magnetic transition temperature (TNC). Instead of the conventional trial-and-error approach, the composition-property descriptor is established via a data-driven machine learning method named SISSO (sure independence screening and sparsifying operator). Bas…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: accepted by Applied Physics Letters in 2024

  44. arXiv:2407.06955  [pdf, other]

    cs.CR cs.CL

    ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization

    Authors: Wai Man Si, Michael Backes, Yang Zhang

    Abstract: In-context learning (ICL) is a recent advancement in the capabilities of large language models (LLMs). This feature allows users to perform a new task without updating the model. Concretely, users can address tasks during the inference time by conditioning on a few input-label pair demonstrations along with the test input. It is different than the conventional fine-tuning paradigm and offers more…

    Submitted 9 July, 2024; originally announced July 2024.

  45. arXiv:2407.06794  [pdf, other]

    cs.CV

    ERQ: Error Reduction for Post-Training Quantization of Vision Transformers

    Authors: Yunshan Zhong, Jiawei Hu, You Huang, Yuxin Zhang, Rongrong Ji

    Abstract: Post-training quantization (PTQ) for vision transformers (ViTs) has garnered significant attention due to its efficiency in compressing models. However, existing methods typically overlook the intricate interdependence between quantized weight and activation, leading to considerable quantization error. In this paper, we propose ERQ, a two-step PTQ approach meticulously crafted to sequentially redu…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: ICML2024 (Spotlight)

  46. arXiv:2407.06691  [pdf, other]

    cs.IT eess.SP

    OFDM Achieves the Lowest Ranging Sidelobe Under Random ISAC Signaling

    Authors: Fan Liu, Ying Zhang, Yifeng Xiong, Shuangyang Li, Weijie Yuan, Feifei Gao, Shi Jin, Giuseppe Caire

    Abstract: This paper aims to answer a fundamental question in the area of Integrated Sensing and Communications (ISAC): What is the optimal communication-centric ISAC waveform for ranging? Towards that end, we first established a generic framework to analyze the sensing performance of communication-centric ISAC waveforms built upon orthonormal signaling bases and random data symbols. Then, we evaluated thei…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 14 pages, 12 figures, submitted to IEEE for possible publication

  47. arXiv:2407.06611  [pdf, other]

    cs.CV cs.AI

    CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding

    Authors: Wenhao Xu, Wenming Weng, Yueyi Zhang, Zhiwei Xiong

    Abstract: We present CEIA, an effective framework for open-world event-based understanding. Currently, training a large event-text model still poses a huge challenge due to the shortage of paired event-text data. In response to this challenge, CEIA learns to align event and image data as an alternative to directly aligning event and text data. Specifically, we leverage the rich event-image datasets t…

    Submitted 9 July, 2024; originally announced July 2024.

  48. arXiv:2407.06590  [pdf, other]

    cs.RO cs.AI

    Revolutionizing Battery Disassembly: The Design and Implementation of a Battery Disassembly Autonomous Mobile Manipulator Robot (BEAM-1)

    Authors: Yanlong Peng, Zhigang Wang, Yisheng Zhang, Shengmin Zhang, Nan Cai, Fan Wu, Ming Chen

    Abstract: The efficient disassembly of end-of-life electric vehicle batteries (EOL-EVBs) is crucial for green manufacturing and sustainable development. The current pre-programmed disassembly conducted by the Autonomous Mobile Manipulator Robot (AMMR) struggles to meet the disassembly requirements in dynamic environments, complex scenarios, and unstructured processes. In this paper, we propose a Battery Disas…

    Submitted 9 July, 2024; originally announced July 2024.

  49. arXiv:2407.06565  [pdf, ps, other]

    math.AP

    Non-uniqueness of Leray weak solutions of the forced MHD equations

    Authors: Jun Wang, Fei Xu, Yong Zhang

    Abstract: In this paper, we exhibit non-uniqueness of Leray weak solutions of the forced magnetohydrodynamic (MHD for short) equations. Similar to the solutions constructed in \cite{ABC2}, we first find a special steady solution of ideal MHD equations whose linear instability was proved in \cite{Lin}. It is possible to perturb the unstable scenario of ideal MHD to 3D viscous and resistive MHD equations, whi…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 23pp. arXiv admin note: text overlap with arXiv:2310.10075, arXiv:2112.03116 by other authors

  50. arXiv:2407.06504  [pdf, other]

    cs.CV

    Reprogramming Distillation for Medical Foundation Models

    Authors: Yuhang Zhou, Siyuan Du, Haolin Li, Jiangchao Yao, Ya Zhang, Yanfeng Wang

    Abstract: Medical foundation models pre-trained on large-scale datasets have demonstrated powerful versatile capabilities for various tasks. However, due to the gap between pre-training tasks (or modalities) and downstream tasks (or modalities), and the real-world computation and speed constraints, it might not be straightforward to apply medical foundation models in the downstream scenarios. Previous methods,…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024