Skip to main content

Showing 101–150 of 10,323 results for author: Zhang, Z

.
  1. arXiv:2406.10950  [pdf, other

    cs.CL

    E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models

    Authors: Zhenyu Zhang, Bingguang Hao, **peng Li, Zekai Zhang, Dongyan Zhao

    Abstract: Most large language models (LLMs) are sensitive to prompts, and another synonymous expression or a typo may lead to unexpected results for the model. Composing an optimal prompt for a specific demand lacks theoretical support and relies entirely on human experimentation, which poses a considerable obstacle to popularizing generative artificial intelligence. However, there is no systematic analysis… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2406.10898  [pdf, other

    cs.RO cs.CV

    TrafficBots V1.5: Traffic Simulation via Conditional VAEs and Transformers with Relative Pose Encoding

    Authors: Zhejun Zhang, Christos Sakaridis, Luc Van Gool

    Abstract: In this technical report we present TrafficBots V1.5, a baseline method for the closed-loop simulation of traffic agents. TrafficBots V1.5 achieves baseline-level performance and a 3rd place ranking in the Waymo Open Sim Agents Challenge (WOSAC) 2024. It is a simple baseline that combines TrafficBots, a CVAE-based multi-agent policy conditioned on each agent's individual destination and personalit… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: A Technical Report for Waymo Open Sim Agents Challenge and CVPR 2024 Workshop on Autonomous Driving

  3. arXiv:2406.10740  [pdf, other

    cs.CV

    FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models

    Authors: Zhikai Zhang, Yitang Li, Haofeng Huang, Mingxian Lin, Li Yi

    Abstract: Human motion synthesis is a fundamental task in computer animation. Despite recent progress in this field utilizing deep learning and motion capture data, existing methods are always limited to specific motion categories, environments, and styles. This poor generalizability can be partially attributed to the difficulty and expense of collecting large-scale and high-quality motion data. At the same… ▽ More

    Submitted 21 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  4. arXiv:2406.10700  [pdf, other

    cs.CV cs.RO

    Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection

    Authors: Guowen Zhang, Lue Fan, Chenhang He, Zhen Lei, Zhaoxiang Zhang, Lei Zhang

    Abstract: Serialization-based methods, which serialize the 3D voxels and group them into multiple sequences before inputting to Transformers, have demonstrated their effectiveness in 3D object detection. However, serializing 3D voxels into 1D sequences will inevitably sacrifice the voxel spatial proximity. Such an issue is hard to be addressed by enlarging the group size with existing serialization-based me… ▽ More

    Submitted 18 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures

  5. arXiv:2406.10584  [pdf, other

    cs.CL

    Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models

    Authors: Chengzhengxu Li, Xiaoming Liu, Zhaohan Zhang, Yichen Wang, Chen Liu, Yu Lan, Chao Shen

    Abstract: Recent advances in prompt optimization have notably enhanced the performance of pre-trained language models (PLMs) on downstream tasks. However, the potential of optimized prompts on domain generalization has been under-explored. To explore the nature of prompt generalization on unknown domains, we conduct pilot experiments and find that (i) Prompts gaining more attention weight from PLMs' deep la… ▽ More

    Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024, Preprint, Under review

  6. arXiv:2406.10524  [pdf, other

    math.NA

    A simple and fast finite difference method for the integral fractional Laplacian of variable order

    Authors: Zhaopeng Hao, Siyuan Shi, Zhongqiang Zhang, Rui Du

    Abstract: For the fractional Laplacian of variable order, an efficient and accurate numerical evaluation in multi-dimension is a challenge for the nature of a singular integral. We propose a simple and easy-to-implement finite difference scheme for the multi-dimensional variable-order fractional Laplacian defined by a hypersingular integral. We prove that the scheme is of second-order convergence and apply… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Submit to Siam Journal on Scientific Computing on Jan. 2024

    MSC Class: 65N06; 65N12; 65T50; 35R11; 26A33

  7. arXiv:2406.10498  [pdf, other

    cs.LG cs.SI

    A Unified Graph Selective Prompt Learning for Graph Neural Networks

    Authors: Bo Jiang, Hao Wu, Ziyan Zhang, Beibei Wang, ** Tang

    Abstract: In recent years, graph prompt learning/tuning has garnered increasing attention in adapting pre-trained models for graph representation learning. As a kind of universal graph prompt learning method, Graph Prompt Feature (GPF) has achieved remarkable success in adapting pre-trained models for Graph Neural Networks (GNNs). By fixing the parameters of a pre-trained GNN model, the aim of GPF is to mod… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  8. Learning Flexible Time-windowed Granger Causality Integrating Heterogeneous Interventional Time Series Data

    Authors: Ziyi Zhang, Shaogang Ren, Xiaoning Qian, Nick Duffield

    Abstract: Granger causality, commonly used for inferring causal structures from time series data, has been adopted in widespread applications across various fields due to its intuitive explainability and high compatibility with emerging deep neural network prediction models. To alleviate challenges in better deciphering causal structures unambiguously from time series, the use of interventional data has bec… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by ACM SIGKDD 2024

  9. arXiv:2406.10416  [pdf, other

    cs.CR cs.DC cs.LG

    Byzantine-Robust Decentralized Federated Learning

    Authors: Minghong Fang, Zifan Zhang, Hairi, Prashant Khanduri, Jia Liu, Songtao Lu, Yuchen Liu, Neil Gong

    Abstract: Federated learning (FL) enables multiple clients to collaboratively train machine learning models without revealing their private training data. In conventional FL, the system follows the server-assisted architecture (server-assisted FL), where the training process is coordinated by a central server. However, the server-assisted FL framework suffers from poor scalability due to a communication bot… ▽ More

    Submitted 20 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: To appear in ACM Conference on Computer and Communications Security 2024 (CCS '24)

  10. arXiv:2406.10310  [pdf, other

    cs.CL cs.AI

    TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs

    Authors: Zhuofeng Li, Zixing Gou, Xiangnan Zhang, Zhongyuan Liu, Sirui Li, Yuntong Hu, Chen Ling, Zheng Zhang, Liang Zhao

    Abstract: Text-Attributed Graphs (TAGs) augment graph structures with natural language descriptions, facilitating detailed depictions of data and their interconnections across various real-world settings. However, existing TAG datasets predominantly feature textual information only at the nodes, with edges typically represented by mere binary or categorical attributes. This lack of rich textual edge annotat… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  11. arXiv:2406.10282  [pdf, other

    cs.CR

    Hardware-based stack buffer overflow attack detection on RISC-V architectures

    Authors: Cristiano Pegoraro Chenet, Ziteng Zhang, Alessandro Savino, Stefano Di Carlo

    Abstract: This work evaluates how well hardware-based approaches detect stack buffer overflow (SBO) attacks in RISC-V systems. We conducted simulations on the PULP platform and examined micro-architecture events using semi-supervised anomaly detection techniques. The findings showed the challenge of detection performance. Thus, a potential solution combines software and hardware-based detectors concurrently… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 2 pages, 2 figures

  12. arXiv:2406.10248  [pdf, other

    cs.CL cs.AI

    On the Worst Prompt Performance of Large Language Models

    Authors: Bowen Cao, Deng Cai, Zhisong Zhang, Yuexian Zou, Wai Lam

    Abstract: The performance of large language models (LLMs) is acutely sensitive to the phrasing of prompts, which raises significant concerns about their reliability in real-world scenarios. Existing studies often divide prompts into task-level instructions and case-level inputs and primarily focus on evaluating and improving robustness against variations in tasks-level instructions. However, this setup fail… ▽ More

    Submitted 21 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  13. TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subty** based on EHR Data

    Authors: Ziyang Zhang, Hejie Cui, Ran Xu, Yuzhang Xie, Joyce C. Ho, Carl Yang

    Abstract: The growing availability of well-organized Electronic Health Records (EHR) data has enabled the development of various machine learning models towards disease risk prediction. However, existing risk prediction methods overlook the heterogeneity of complex diseases, failing to model the potential disease subtypes regarding their corresponding patient visits and clinical concept subgroups. In this w… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures, to be published in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

  14. arXiv:2406.09846  [pdf, ps, other

    cs.IT eess.SP

    Multiple Intelligent Reflecting Surfaces Collaborative Wireless Localization System

    Authors: Ziheng Zhang, Wen Chen, Qingqing Wu, Zhendong Li, Xusheng Zhu, **gfeng Chen, Nan Cheng

    Abstract: This paper studies a multiple intelligent reflecting surfaces (IRSs) collaborative localization system where multiple semi-passive IRSs are deployed in the network to locate one or more targets based on time-of-arrival. It is assumed that each semi-passive IRS is equipped with reflective elements and sensors, which are used to establish the line-of-sight links from the base station (BS) to multipl… ▽ More

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 13 pages, 8 figures

  15. arXiv:2406.09836  [pdf, other

    cs.LG cs.CR

    Robustness-Inspired Defense Against Backdoor Attacks on Graph Neural Networks

    Authors: Zhiwei Zhang, Minhua Lin, Junjie Xu, Zongyu Wu, Enyan Dai, Suhang Wang

    Abstract: Graph Neural Networks (GNNs) have achieved promising results in tasks such as node classification and graph classification. However, recent studies reveal that GNNs are vulnerable to backdoor attacks, posing a significant threat to their real-world adoption. Despite initial efforts to defend against specific graph backdoor attacks, there is no work on defending against various types of backdoor at… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  16. arXiv:2406.09779  [pdf, other

    cs.AI cs.CL cs.CV

    OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst

    Authors: **gtao Cao, Zheng Zhang, Hongru Wang, Bin Liang, Hao Wang, Kam-Fai Wong

    Abstract: Memes, which rapidly disseminate personal opinions and positions across the internet, also pose significant challenges in propagating social bias and prejudice. This study presents a novel approach to detecting harmful memes, particularly within the multicultural and multilingual context of Singapore. Our methodology integrates image captioning, Optical Character Recognition (OCR), and Large Langu… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  17. arXiv:2406.09612  [pdf, other

    cs.AI cs.LG physics.chem-ph

    Automated Molecular Concept Generation and Labeling with Large Language Models

    Authors: Shichang Zhang, Botao Xia, Zimin Zhang, Qianli Wu, Fang Sun, Ziniu Hu, Yizhou Sun

    Abstract: Artificial intelligence (AI) is significantly transforming scientific research. Explainable AI methods, such as concept-based models (CMs), are promising for driving new scientific discoveries because they make predictions based on meaningful concepts and offer insights into the prediction process. In molecular science, however, explainable CMs are not as common compared to black-box models like G… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  18. arXiv:2406.09611  [pdf, other

    cs.HC

    Recy-ctronics: Designing Fully Recyclable Electronics With Varied Form Factors

    Authors: Tingyu Cheng, Zhihan Zhang, Han Huang, Yingting Gao, Wei Sun, Gregory D. Abowd, HyunJoo Oh, Josiah Hester

    Abstract: For today's electronics manufacturing process, the emphasis on stable functionality, durability, and fixed physical forms is designed to ensure long-term usability. However, this focus on robustness and permanence complicates the disassembly and recycling processes, leading to significant environmental repercussions. In this paper, we present three approaches that leverage easily recyclable materi… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  19. arXiv:2406.09552  [pdf, other

    astro-ph.GA

    Molecular gas excitation in the circumgalactic medium of MACS1931-26

    Authors: L. Ghodsi, J. Zhou, P. Andreani, C. De Breuck, A. W. S. Man, Y. Miyamoto, T. G. Bisbas, A. Lundgren, Z. -Y. Zhang

    Abstract: The evolution of galaxies is largely affected by exchanging material with their close environment, the circumgalactic medium (CGM). In this work, we investigate the CGM and the interstellar medium (ISM) of the bright central galaxy (BCG) of the galaxy cluster, MACS1931-26 at z~0.35. We detected [CI](2-1), CO(1-0), and CO(7-6) emission lines with the APEX 12-m and NRO 45-m telescopes. We complement… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 5 figures, accepted for publication by A&A

  20. arXiv:2406.09528  [pdf, other

    astro-ph.EP astro-ph.SR

    JWST/NIRCam 4-5 $μ$m Imaging of the Giant Planet AF Lep b

    Authors: Kyle Franson, William O. Balmer, Brendan P. Bowler, Laurent Pueyo, Yifan Zhou, Emily Rickman, Zhoujian Zhang, Sagnick Mukherjee, Tim D. Pearce, Daniella C. Bardalez Gagliuffi, Lauren I. Biddle, Timothy D. Brandt, Rachel Bowens-Rubin, Justin R. Crepp, James W. Davidson, Jr., Jacqueline Faherty, Christian Ginski, Elliott P. Horch, Marvin Morgan, Caroline V. Morley, Marshall D. Perrin, Aniket Sanghi, Maissa Salama, Christopher A. Theissen, Quang H. Tran , et al. (1 additional authors not shown)

    Abstract: With a dynamical mass of $3 \, M_\mathrm{Jup}$, the recently discovered giant planet AF Lep b is the lowest-mass imaged planet with a direct mass measurement. Its youth and spectral type near the L/T transition make it a promising target to study the impact of clouds and atmospheric chemistry at low surface gravities. In this work, we present JWST/NIRCam imaging of AF Lep b. Across two epochs, we… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 17 pages, 4 figures, submitted to ApJL

  21. arXiv:2406.09475  [pdf, other

    hep-ex

    Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  22. arXiv:2406.09397  [pdf, other

    cs.CV cs.AI

    Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms

    Authors: Miaosen Zhang, Yixuan Wei, Zhen Xing, Yifei Ma, Zuxuan Wu, Ji Li, Zheng Zhang, Qi Dai, Chong Luo, Xin Geng, Baining Guo

    Abstract: Modern vision models are trained on very large noisy datasets. While these models acquire strong capabilities, they may not follow the user's intent to output the desired results in certain aspects, e.g., visual aesthetic, preferred style, and responsibility. In this paper, we target the realm of visual aesthetics and aim to align vision models with human aesthetic standards in a retrieval system.… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 28 pages, 26 figures, under review

  23. arXiv:2406.09389  [pdf, other

    eess.IV cs.CV

    Sagiri: Low Dynamic Range Image Enhancement with Generative Diffusion Prior

    Authors: Baiang Li, Sizhuo Ma, Yanhong Zeng, Xiaogang Xu, Youqing Fang, Zhao Zhang, Jian Wang, Kai Chen

    Abstract: Capturing High Dynamic Range (HDR) scenery using 8-bit cameras often suffers from over-/underexposure, loss of fine details due to low bit-depth compression, skewed color distributions, and strong noise in dark areas. Traditional LDR image enhancement methods primarily focus on color map**, which enhances the visual representation by expanding the image's color range and adjusting the brightness… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: https://sagiri0208.github.io

  24. arXiv:2406.09356  [pdf, other

    cs.CV eess.IV

    CMC-Bench: Towards a New Paradigm of Visual Signal Compression

    Authors: Chunyi Li, Xiele Wu, Haoning Wu, Donghui Feng, Zicheng Zhang, Guo Lu, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai, Weisi Lin

    Abstract: Ultra-low bitrate image compression is a challenging and demanding topic. With the development of Large Multimodal Models (LMMs), a Cross Modality Compression (CMC) paradigm of Image-Text-Image has emerged. Compared with traditional codecs, this semantic-level compression can reduce image data size to 0.1\% or even lower, which has strong potential applications. However, CMC has certain defects in… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  25. arXiv:2406.09196  [pdf, other

    cs.CV cs.LG

    Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

    Authors: Ke Fan, Zechen Bai, Tianjun Xiao, Tong He, Max Horn, Yanwei Fu, Francesco Locatello, Zheng Zhang

    Abstract: Object-centric learning (OCL) extracts the representation of objects with slots, offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot attention, which utilizes attention mechanisms to iteratively refine slot representations. However, a major drawback of most object-centric models, including slot… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: CVPR 2024

  26. arXiv:2406.08959  [pdf, other

    cs.HC cs.AI

    Beyond Recommendations: From Backward to Forward AI Support of Pilots' Decision-Making Process

    Authors: Zelun Tony Zhang, Sebastian S. Feger, Lucas Dullenkopf, Rulu Liao, Lukas Süsslin, Yuanting Liu, Andreas Butz

    Abstract: AI is anticipated to enhance human decision-making in high-stakes domains like aviation, but adoption is often hindered by challenges such as inappropriate reliance and poor alignment with users' decision-making. Recent research suggests that a core underlying issue is the recommendation-centric design of many AI systems, i.e., they give end-to-end recommendations and ignore the rest of the decisi… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to CSCW 2024, to be published in PACM HCI Vol. 8, No. CSCW2

  27. arXiv:2406.08795  [pdf, ps, other

    math.AP

    Hölder Continuity for Fully Fractional Parabolic Equations with Space-time Nonlocal Operators

    Authors: Lingwei Ma, Qi Xiong, Zhenqiu Zhang

    Abstract: We study the local Hölder regularity of weak solutions to the fully fractional parabolic equations involving spatial fractional diffusion and fractional time derivatives of the Marchaud type. It is worth noting that we do not impose boundedness assumptions on the weak solutions and nonhomogeneous terms. Within the space-time nonlocal framework, it is crucial to consider both space-dependent nonloc… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  28. arXiv:2406.08792  [pdf, other

    cond-mat.dis-nn cond-mat.mtrl-sci

    Revealing hidden medium-range order in silicate glass-formers using many-body correlation functions

    Authors: Zhen Zhang, Walter Kob

    Abstract: The medium range order (MRO) in amorphous systems has been linked to complex features such as the dynamic heterogeneity of supercooled liquids or the plastic deformation of glasses. However, the nature of the MRO in these materials has remained elusive, primarily due to the lack of methods capable of characterizing this order. Here, we leverage standard two-body structural correlators and advanced… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  29. arXiv:2406.08771  [pdf, other

    cs.SD cs.AI eess.AS

    MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection

    Authors: Da Mu, Zhicheng Zhang, Haobo Yue

    Abstract: Sound Event Localization and Detection (SELD) involves detecting and localizing sound events using multichannel sound recordings. Previously proposed Event-Independent Network V2 (EINV2) has achieved outstanding performance on SELD. However, it still faces challenges in effectively extracting features across spectral, spatial, and temporal domains. This paper proposes a three-stage network structu… ▽ More

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  30. arXiv:2406.08757  [pdf, other

    cs.CL cs.AI

    SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

    Authors: Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang

    Abstract: Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding. Existing datasets, such as FUNSD and XFUND, support entity classification and relationship prediction tasks but are typically limited to local and entity-level annotations. This limitation overlooks the hierarchically structured representation of documents,… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Track on Datasets and Benchmarks under review

  31. arXiv:2406.08712  [pdf, other

    physics.ins-det hep-ex

    A Novel Diamond-like Carbon based photocathode for PICOSEC Micromegas detectors

    Authors: X. Wang, R. Aleksan, Y. Angelis, J. Bortfeldt, F. Brunbauer, M. Brunoldi, E. Chatzianagnostou, J. Datta, K. Degmelt, G. Fanourakis, D. Fiorina, K. J. Floethner, M. Gallinaro, F. Garcia, I. Giomataris, K. Gnanvo, F. J. Iguaz, D. Janssens, A. Kallitsopoulou, M. Kovacic, B. Kross, P. Legou, M. Lisowska, J. Liu, I. Maniatis , et al. (26 additional authors not shown)

    Abstract: The PICOSEC Micromegas (MM) detector is a precise timing gaseous detector based on a MM detector operating in a two-stage amplification mode and a Cherenkov radiator. Prototypes equipped with cesium iodide (CsI) photocathodes have shown promising time resolutions as precise as 24 picoseconds (ps) for Minimum Ionizing Particles. However, due to the high hygroscopicity and susceptibility to ion bomb… ▽ More

    Submitted 25 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  32. arXiv:2406.08487  [pdf, other

    cs.CV

    Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

    Authors: Yi-Fan Zhang, Qingsong Wen, Chaoyou Fu, Xue Wang, Zhang Zhang, Liang Wang, Rong **

    Abstract: Seeing clearly with high resolution is a foundation of Large Multimodal Models (LMMs), which has been proven to be vital for visual perception and reasoning. Existing works usually employ a straightforward resolution upscaling method, where the image consists of global and local branches, with the latter being the sliced image patches but resized to the same resolution as the former. This means th… ▽ More

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Project page: https://github.com/yfzhang114/SliME

  33. arXiv:2406.08481  [pdf, other

    cs.CV

    Enhancing End-to-End Autonomous Driving with Latent World Model

    Authors: Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang, Tieniu Tan

    Abstract: End-to-end autonomous driving has garnered widespread attention. Current end-to-end approaches largely rely on the supervision from perception tasks such as detection, tracking, and map segmentation to aid in learning scene representations. However, these methods require extensive annotations, hindering the data scalability. To address this challenge, we propose a novel self-supervised method to e… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  34. HiFAST : An HI Data Calibration and Imaging Pipeline for FAST II. Flux Density Calibration

    Authors: Ziming Liu, Jie Wang, Yingjie **g, Zhi-Yu Zhang, Chen Xu, Tiantian Liang, Qingze Chen, Ningyu Tang, Qingliang Yang

    Abstract: Accurate flux density calibration is essential for precise analysis and interpretation of observations across different observation modes and instruments. In this research, we firstly introduce the flux calibration model incorporated in HIFAST pipeline, designed for processing HI 21-cm spectra. Furthermore, we investigate different calibration techniques and assess the dependence of the gain param… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 14 pages, 15 figures, accepted by RAA

  35. arXiv:2406.08225  [pdf, ps, other

    hep-ex

    Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (636 additional authors not shown)

    Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  36. arXiv:2406.08124  [pdf, other

    cs.CL cs.AI

    Legend: Leveraging Representation Engineering to Annotate Safety Margin for Preference Datasets

    Authors: Duanyu Feng, Bowen Qin, Chen Huang, Youcheng Huang, Zheng Zhang, Wenqiang Lei

    Abstract: The success of the reward model in distinguishing between responses with subtle safety differences depends critically on the high-quality preference dataset, which should capture the fine-grained nuances of harmful and harmless responses. This motivates the need to develop a dataset involving preference margins, which accurately quantify how harmless one response is compared to another. In this pa… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Our code is available at https://github.com/colfeng/Legend

  37. arXiv:2406.08115  [pdf, other

    cs.DC cs.AI

    Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

    Authors: Feng Liang, Zhen Zhang, Haifeng Lu, Chengming Li, Victor C. M. Leung, Yanyi Guo, Xi** Hu

    Abstract: With rapidly increasing distributed deep learning workloads in large-scale data centers, efficient distributed deep learning framework strategies for resource allocation and workload scheduling have become the key to high-performance deep learning. The large-scale environment with large volumes of datasets, models, and computational and communication resources raises various unique challenges for… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  38. arXiv:2406.08090  [pdf, other

    cs.CV

    From Sim-to-Real: Toward General Event-based Low-light Frame Interpolation with Per-scene Optimization

    Authors: Ziran Zhang, Yongrui Ma, Yueting Chen, Feng Zhang, **wei Gu, Tianfan Xue, Shi Guo

    Abstract: Video Frame Interpolation (VFI) is important for video enhancement, frame rate up-conversion, and slow-motion generation. The introduction of event cameras, which capture per-pixel brightness changes asynchronously, has significantly enhanced VFI capabilities, particularly for high-speed, nonlinear motions. However, these event-based methods encounter challenges in low-light conditions, notably tr… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  39. arXiv:2406.07984  [pdf, ps, other

    math.AP

    On the density patch problem for the 2-D inhomogeneous Navier-Stokes equations

    Authors: Tiantian Hao, Feng Shao, Dongyi Wei, Zhifei Zhang

    Abstract: In this paper, we first construct a class of global strong solutions for the 2-D inhomogeneous Navier-Stokes equations under very general assumption that the initial density is only bounded and the initial velocity is in $H^1(\mathbb{R}^2)$. With suitable assumptions on the initial density, which includes the case of density patch and vacuum bubbles, we prove that Lions' s weak solution is the sam… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 23 pages

  40. arXiv:2406.07918  [pdf, other

    eess.IV

    Micro-expression recognition based on depth map to point cloud

    Authors: Ren Zhang, Jianqin Yin, Chao Qi, Zehao Wang, Zhicheng Zhang, Yonghao Dang

    Abstract: Micro-expressions are nonverbal facial expressions that reveal the covert emotions of individuals, making the micro-expression recognition task receive widespread attention. However, the micro-expression recognition task is challenging due to the subtle facial motion and brevity in duration. Many 2D image-based methods have been developed in recent years to recognize MEs effectively, but, these ap… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  41. arXiv:2406.07601  [pdf, other

    astro-ph.HE hep-ex

    IceCube Search for Neutrino Emission from X-ray Bright Seyfert Galaxies

    Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi , et al. (400 additional authors not shown)

    Abstract: The recent IceCube detection of TeV neutrino emission from the nearby active galaxy NGC 1068 suggests that active galactic nuclei (AGN) could make a sizable contribution to the diffuse flux of astrophysical neutrinos. The absence of TeV $γ$-rays from NGC 1068 indicates neutrino production in the vicinity of the supermassive black hole, where the high radiation density leads to $γ$-ray attenuation.… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 17 pages, 9 figures

  42. arXiv:2406.07499  [pdf, other

    cs.CV cs.GR

    Trim 3D Gaussian Splatting for Accurate Geometry Representation

    Authors: Lue Fan, Yuxue Yang, Minxing Li, Hongsheng Li, Zhaoxiang Zhang

    Abstract: In this paper, we introduce Trim 3D Gaussian Splatting (TrimGS) to reconstruct accurate 3D geometry from images. Previous arts for geometry reconstruction from 3D Gaussians mainly focus on exploring strong geometry regularization. Instead, from a fresh perspective, we propose to obtain accurate 3D geometry of a scene by Gaussian trimming, which selectively removes the inaccurate geometry while pre… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Project page: https://trimgs.github.io/

  43. arXiv:2406.07362  [pdf, other

    cs.HC

    AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database

    Authors: Wanling Gao, Yuan Liu, Zhuoming Yu, Dandan Cui, Wen**g Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Fan Huang, Gangyuan Zhao, Chongrong Jiang, Tianyi Wei, Zhifei Zhang, Yunyou Huang, Jianfeng Zhan

    Abstract: Artificial Intelligence (AI) plays a crucial role in medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI f… ▽ More

    Submitted 15 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 12 pages

  44. arXiv:2406.07306  [pdf, other

    physics.med-ph

    A directional total variation minimization algorithm for isotropic resolution in digital breast tomosynthesis

    Authors: Emil Y. Sidky, Xiangyi Wu, Xiaoyu Duan, Hailing Huang, Wei Zhao, Leo Y. Zhang, John Paul Phillips, Zheng Zhang, Buxin Chen, Dan Xia, Ingrid S. Reiser, Xiaochuan Pan

    Abstract: An optimization-based image reconstruction algorithm is developed for contrast enhanced digital breast tomosynthesis (DBT) using dual-energy scanning. The algorithm minimizes directional total variation (TV) with a data discrepancy and non-negativity constraints. Iodinated contrast agent (ICA) imaging is performed by reconstructing images from dual-energy DBT data followed by weighted subtraction.… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Proceedings paper for accepted contribution to the 8th International Conference on Image Formation in X-Ray Computed Tomography (https://www.ct-meeting.org)

  45. arXiv:2406.07238  [pdf

    cond-mat.supr-con

    Structures and Superconductivity of Hydrogen and Hydrides under Extreme Pressure

    Authors: Zihan Zhang, Wendi Zhao, Defang Duan, Tian Cui

    Abstract: Metallic hydrogen, existing in remarkably extreme environments, was predicted to exhibit long-sought room-temperature superconductivity. Although the superconductivity of metallic hydrogen has not been confirmed experimentally, superconductivity of hydrogen in hydrides was recently discovered with remarkably high critical temperature as theoretically predicted. In recent years, theoretical simulat… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 35 pages, 9 figures

  46. arXiv:2406.07021  [pdf, other

    cs.SE

    A Tool for Test Case Scenarios Generation Using Large Language Models

    Authors: Abdul Malik Sami, Zeeshan Rasheed, Muhammad Waseem, Zheying Zhang, Herda Tomas, Pekka Abrahamsson

    Abstract: Large Language Models (LLMs) are widely used in Software Engineering (SE) for various tasks, including generating code, designing and documenting software, adding code comments, reviewing code, and writing test scripts. However, creating test scripts or automating test cases demands test suite documentation that comprehensively covers functional requirements. Such documentation must enable thoroug… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 6 pages, 2 figures, and 1 table

  47. arXiv:2406.06829  [pdf, other

    cs.LG stat.ML

    Personalized Binomial DAGs Learning with Network Structured Covariates

    Authors: Boxin Zhao, Weishi Wang, Dingyuan Zhu, Ziqi Liu, Dong Wang, Zhiqiang Zhang, Jun Zhou, Mladen Kolar

    Abstract: The causal dependence in data is often characterized by Directed Acyclic Graphical (DAG) models, widely used in many areas. Causal discovery aims to recover the DAG structure using observational data. This paper focuses on causal discovery with multi-variate count data. We are motivated by real-world web visit data, recording individual user visits to multiple websites. Building a causal diagram c… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  48. arXiv:2406.06684  [pdf, other

    astro-ph.HE

    Search for neutrino emission from hard X-ray AGN with IceCube

    Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi , et al. (401 additional authors not shown)

    Abstract: Active Galactic Nuclei (AGN) are promising candidate sources of high-energy astrophysical neutrinos since they provide environments rich in matter and photon targets where cosmic ray interactions may lead to the production of gamma rays and neutrinos. We searched for high-energy neutrino emission from AGN using the $\textit{Swift}$-BAT Spectroscopic Survey (BASS) catalog of hard X-ray sources and… ▽ More

    Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  49. arXiv:2406.06626  [pdf, other

    cs.LG cs.AI cs.HC eess.SP

    Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications

    Authors: Zhou Zhou, Guohang He, Zheng Zhang, Luziwei Leng, Qinghai Guo, Jianxing Liao, Xuan Song, Ran Cheng

    Abstract: Traditional invasive Brain-Computer Interfaces (iBCIs) typically depend on neural decoding processes conducted on workstations within laboratory settings, which prevents their everyday usage. Implementing these decoding processes on edge devices, such as the wearables, introduces considerable challenges related to computational demands, processing speed, and maintaining accuracy. This study seeks… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  50. arXiv:2406.06593  [pdf, other

    cs.LG cs.AI cs.AR

    Differentiable Combinatorial Scheduling at Scale

    Authors: Mingju Liu, Yingjie Li, Jiaqi Yin, Zhiru Zhang, Cunxi Yu

    Abstract: This paper addresses the complex issue of resource-constrained scheduling, an NP-hard problem that spans critical areas including chip design and high-performance computing. Traditional scheduling methods often stumble over scalability and applicability challenges. We propose a novel approach using a differentiable combinatorial scheduling framework, utilizing Gumbel-Softmax differentiable samplin… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 13 pages; International Conference on Machine Learning (ICML'24)