Skip to main content

Showing 1–50 of 3,843 results for author: wang, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.03107  [pdf

    cs.HC cs.GR cs.MM

    Design of a UE5-based digital twin platform

    Authors: Shaoqiu Lyu, Muzhi Wang, Sunrui Zhang, Shengzhi Wang

    Abstract: Aiming at the current mainstream 3D scene engine learning and building cost is too high, this thesis proposes a digital twin platform design program based on Unreal Engine 5 (UE5). It aims to provide a universal platform construction design process to effectively reduce the learning cost of large-scale scene construction. Taking an actual project of a unit as an example, the overall cycle work of… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2407.03089  [pdf, other

    eess.SP cs.LG q-bio.NC

    Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis

    Authors: Tong Zhou, Shuqiang Wang

    Abstract: Electroencephalogram (EEG) technology, particularly high-density EEG (HD EEG) devices, is widely used in fields such as neuroscience. HD EEG devices improve the spatial resolution of EEG by placing more electrodes on the scalp, meeting the requirements of clinical diagnostic applications such as epilepsy focus localization. However, this technique faces challenges such as high acquisition costs an… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2407.02408  [pdf, other

    cs.CL cs.LG

    CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

    Authors: Song Wang, Peng Wang, Tong Zhou, Yushun Dong, Zhen Tan, Jundong Li

    Abstract: As Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks, concerns regarding the potential negative societal impacts of LLM-generated content have also arisen. To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets. However, existing bias evaluation efforts often focus on only a particular type o… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 37 pages, 32 figures

  4. arXiv:2407.02392  [pdf, other

    cs.CV

    TokenPacker: Efficient Visual Projector for Multimodal LLM

    Authors: Wentong Li, Yuqian Yuan, Jian Liu, Dongqi Tang, Song Wang, Jianke Zhu, Lei Zhang

    Abstract: The visual projector serves as an essential bridge between the visual encoder and the Large Language Model (LLM) in a Multimodal LLM (MLLM). Typically, MLLMs adopt a simple MLP to preserve all visual contexts via one-to-one transformation. However, the visual tokens are redundant and can be considerably increased when dealing with high-resolution images, impairing the efficiency of MLLMs significa… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 16 pages, Codes:https://github.com/CircleRadon/TokenPacker

  5. arXiv:2407.02243  [pdf, other

    cs.CL cs.SD eess.AS

    Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization

    Authors: Yuchen Hu, Chen Chen, Siyin Wang, Eng Siong Chng, Chao Zhang

    Abstract: In this paper, we propose reverse inference optimization (RIO), a simple and effective method designed to enhance the robustness of autoregressive-model-based zero-shot text-to-speech (TTS) systems using reinforcement learning from human feedback (RLHF). To assess the quality of speech produced by the TTS system without human annotations, RIO introduces a novel concept termed as reverse inference… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 12 pages, Work in progress

  6. arXiv:2407.02119  [pdf, other

    cs.LG cs.AI cs.CL

    Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning

    Authors: Yifang Chen, Shuohang Wang, Ziyi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin Jamieson, Simon Shaolei Du, Yelong Shen

    Abstract: Reinforcement learning with human feedback (RLHF), as a widely adopted approach in current large language model pipelines, is \textit{bottlenecked by the size of human preference data}. While traditional methods rely on offline preference dataset constructions, recent approaches have shifted towards online settings, where a learner uses a small amount of labeled seed data and a large pool of unlab… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  7. arXiv:2407.01534  [pdf, other

    cs.NI

    AIGC-Assisted Digital Watermark Services in Low-Earth Orbit Satellite-Terrestrial Edge Networks

    Authors: Kongyang Chen, Yikai Li, Wenjun Lan, Bing Mi, Shaowei Wang

    Abstract: Low Earth Orbit (LEO) satellite communication is a crucial component of future 6G communication networks, contributing to the development of an integrated satellite-terrestrial network. In the forthcoming satellite-to-ground network, the idle computational resources of LEO satellites can serve as edge servers, delivering intelligent task computation services to ground users. Existing research on s… ▽ More

    Submitted 8 March, 2024; originally announced July 2024.

  8. arXiv:2407.01330  [pdf, other

    cs.CV

    Learning Unsigned Distance Fields from Local Shape Functions for 3D Surface Reconstruction

    Authors: Jiangbei Hu, Yanggeng Li, Fei Hou, Junhui Hou, Zhebin Zhang, Shengfa Wang, Na Lei, Ying He

    Abstract: Unsigned distance fields (UDFs) provide a versatile framework for representing a diverse array of 3D shapes, encompassing both watertight and non-watertight geometries. Traditional UDF learning methods typically require extensive training on large datasets of 3D shapes, which is costly and often necessitates hyperparameter adjustments for new datasets. This paper presents a novel neural framework,… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures

    ACM Class: I.3.5

  9. arXiv:2407.01260  [pdf, other

    cs.CR

    DeepiSign-G: Generic Watermark to Stamp Hidden DNN Parameters for Self-contained Tracking

    Authors: Alsharif Abuadbba, Nicholas Rhodes, Kristen Moore, Bushra Sabir, Shuo Wang, Yansong Gao

    Abstract: Deep learning solutions in critical domains like autonomous vehicles, facial recognition, and sentiment analysis require caution due to the severe consequences of errors. Research shows these models are vulnerable to adversarial attacks, such as data poisoning and neural trojaning, which can covertly manipulate model behavior, compromising reliability and safety. Current defense strategies like wa… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 13 pages

  10. arXiv:2407.01103  [pdf, other

    cs.RO

    FedRC: A Rapid-Converged Hierarchical Federated Learning Framework in Street Scene Semantic Understanding

    Authors: Wei-Bin Kou, Qingfeng Lin, Ming Tang, Shuai Wang, Guangxu Zhu, Yik-Chung Wu

    Abstract: Street Scene Semantic Understanding (denoted as TriSU) is a crucial but complex task for world-wide distributed autonomous driving (AD) vehicles (e.g., Tesla). Its inference model faces poor generalization issue due to inter-city domain-shift. Hierarchical Federated Learning (HFL) offers a potential solution for improving TriSU model generalization, but suffers from slow convergence rate because o… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This work has been accepted by 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  11. arXiv:2407.01102  [pdf, other

    cs.CL cs.IR

    BERGEN: A Benchmarking Library for Retrieval-Augmented Generation

    Authors: David Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang, Vassilina Nikoulina, Stéphane Clinchant

    Abstract: Retrieval-Augmented Generation allows to enhance Large Language Models with external knowledge. In response to the recent popularity of generative LLMs, many RAG approaches have been proposed, which involve an intricate number of different configurations such as evaluation datasets, collections, metrics, retrievers, and LLMs. Inconsistent benchmarking poses a major challenge in comparing approache… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 29 pages

  12. arXiv:2407.01067  [pdf, other

    cs.AI cs.CL cs.CV cs.HC cs.LG

    Human-like object concept representations emerge naturally in multimodal large language models

    Authors: Changde Du, Kaicheng Fu, Bincheng Wen, Yi Sun, Jie Peng, Wei Wei, Ying Gao, Shengpei Wang, Chuncheng Zhang, **peng Li, Shuang Qiu, Le Chang, Huiguang He

    Abstract: The conceptualization and categorization of natural objects in the human mind have long intrigued cognitive scientists and neuroscientists, offering crucial insights into human perception and cognition. Recently, the rapid development of Large Language Models (LLMs) has raised the attractive question of whether these models can also develop human-like object representations through exposure to vas… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  13. arXiv:2407.00611  [pdf, other

    cs.DC

    WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem

    Authors: Ziming Liu, Shaoyu Wang, Shenggan Cheng, Zhongkai Zhao, Xuanlei Zhao, James Demmel, Yang You

    Abstract: In recent years, Transformer-based Large Language Models (LLMs) have garnered significant attention due to their exceptional performance across a variety of tasks. However, training these models on long sequences presents a substantial challenge in terms of efficiency and scalability. Current methods are constrained either by the number of attention heads, limiting scalability, or by excessive com… ▽ More

    Submitted 1 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

  14. arXiv:2407.00131  [pdf, other

    cs.LG cs.AI cs.CV

    RepAct: The Re-parameterizable Adaptive Activation Function

    Authors: Xian Wu, Qingchuan Tao, Shuang Wang

    Abstract: Addressing the imperative need for efficient artificial intelligence in IoT and edge computing, this study presents RepAct, a re-parameterizable adaptive activation function tailored for optimizing lightweight neural networks within the computational limitations of edge devices. By employing a multi-branch structure with learnable adaptive weights, RepAct enriches feature processing and enhances c… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

  15. arXiv:2407.00128  [pdf, other

    cs.IR cs.AI cs.LG

    When Search Engine Services meet Large Language Models: Visions and Challenges

    Authors: Haoyi Xiong, Jiang Bian, Yuchen Li, Xuhong Li, Mengnan Du, Shuaiqiang Wang, Dawei Yin, Sumi Helal

    Abstract: Combining Large Language Models (LLMs) with search engine services marks a significant shift in the field of services computing, opening up new possibilities to enhance how we search for and retrieve information, understand content, and interact with internet services. This paper conducts an in-depth examination of how integrating LLMs with search engines can mutually benefit both technologies. We… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Under Review

  16. arXiv:2407.00056  [pdf, other

    cs.IR cs.AI cs.SI

    MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion

    Authors: Jiaxin Deng, Shiyao Wang, Yuchen Wang, Jiansong Qi, Liqin Zhao, Guorui Zhou, Gaofeng Meng

    Abstract: Live streaming services are becoming increasingly popular due to real-time interactions and entertainment. Viewers can chat and send comments or virtual gifts to express their preferences for the streamers. Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue. Previous studies on live streaming gifting prediction treat this task as a… ▽ More

    Submitted 15 June, 2024; originally announced July 2024.

    Comments: Accepted at KDD 2024

  17. arXiv:2406.19749  [pdf, other

    eess.IV cs.CV

    SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation

    Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Automatic vessel segmentation is paramount for develo** next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performances due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  18. arXiv:2406.19500  [pdf, other

    cs.AI cs.CL

    Knowledge acquisition for dialogue agents using reinforcement learning on graph representations

    Authors: Selene Baez Santamaria, Shihan Wang, Piek Vossen

    Abstract: We develop an artificial agent motivated to augment its knowledge base beyond its initial training. The agent actively participates in dialogues with other agents, strategically acquiring new information. The agent models its knowledge as an RDF knowledge graph, integrating new beliefs acquired through conversation. Responses in dialogue are generated by identifying graph patterns around these new… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  19. arXiv:2406.19417  [pdf, other

    cs.CR cs.AI

    "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models

    Authors: Zhen Tan, Chengshuai Zhao, Raha Moraffah, Yifan Li, Song Wang, Jundong Li, Tianlong Chen, Huan Liu

    Abstract: Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) by integrating external knowledge bases, improving their performance in applications like fact-checking and information searching. In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases by injecting deceptive content into the retrieval database, intentionall… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Preprint

  20. arXiv:2406.19043  [pdf

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Ya**g Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, **g Qin, Xiahai Zhuang, Claudia Prieto , et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  21. arXiv:2406.18950  [pdf, other

    eess.IV cs.CV

    MMR-Mamba: Multi-Contrast MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion

    Authors: **g Zou, Lanqing Liu, Qi Chen, Shujun Wang, Xiaohan Xing, **g Qin

    Abstract: Multi-contrast MRI acceleration has become prevalent in MR imaging, enabling the reconstruction of high-quality MR images from under-sampled k-space data of the target modality, using guidance from a fully-sampled auxiliary modality. The main crux lies in efficiently and comprehensively integrating complementary information from the auxiliary modality. Existing methods either suffer from quadratic… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figure

  22. arXiv:2406.18783  [pdf, other

    cs.CL cs.LG

    Psychological Profiling in Cybersecurity: A Look at LLMs and Psycholinguistic Features

    Authors: Jean Marie Tshimula, D'Jeff K. Nkashama, Jean Tshibangu Muabila, René Manassé Galekwa, Hugues Kanda, Maximilien V. Dialufuma, Mbuyi Mukendi Didier, Kalala Kalonji, Serge Mundele, Patience Kinshie Lenye, Tighana Wenge Basele, Aristarque Ilunga, Christian N. Mayemba, Nathanaël M. Kasoro, Selain K. Kasereka, Hardy Mikese, Pierre-Martin Tardif, Marc Frappier, Froduald Kabanza, Belkacem Chikhaoui, Shengrui Wang, Ali Mulenda Sumbu, Xavier Ndona, Raoul Kienge-Kienge Intudi

    Abstract: The increasing sophistication of cyber threats necessitates innovative approaches to cybersecurity. In this paper, we explore the potential of psychological profiling techniques, particularly focusing on the utilization of Large Language Models (LLMs) and psycholinguistic features. We investigate the intersection of psychology and cybersecurity, discussing how LLMs can be employed to analyze textu… ▽ More

    Submitted 28 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  23. arXiv:2406.18532  [pdf, other

    cs.CL cs.AI cs.LG

    Symbolic Learning Enables Self-Evolving Agents

    Authors: Wangchunshu Zhou, Yixin Ou, Shengwei Ding, Long Li, Jialong Wu, Tiannan Wang, Jiamin Chen, Shuai Wang, Xiaohua Xu, Ningyu Zhang, Huajun Chen, Yuchen Eleanor Jiang

    Abstract: The AI community has been exploring a pathway to artificial general intelligence (AGI) by develo** "language agents", which are complex large language models (LLMs) pipelines involving both prompting techniques and tool usage methods. While language agents have demonstrated impressive capabilities for many real-world tasks, a fundamental limitation of current language agents research is that the… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Code available at https://github.com/aiwaves-cn/agents

  24. arXiv:2406.18145  [pdf, other

    cs.CR cs.LG

    Beyond Statistical Estimation: Differentially Private Individual Computation in the Shuffle Model

    Authors: Shaowei Wang, Changyu Dong, Di Wang, Xiangfu Song

    Abstract: The shuffle model of differential privacy (DP) has recently emerged as a powerful one for decentralized computation without fully trustable parties. Since it anonymizes and permutes messages from clients through a shuffler, the privacy can be amplified and utility can be improved. However, the shuffling procedure in turn restricts its applications only to statistical tasks that are permutation-inv… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  25. arXiv:2406.18079  [pdf, other

    cs.CV eess.IV

    MFDNet: Multi-Frequency Deflare Network for Efficient Nighttime Flare Removal

    Authors: Yiguo Jiang, Xuhang Chen, Chi-Man Pun, Shuqiang Wang, Wei Feng

    Abstract: When light is scattered or reflected accidentally in the lens, flare artifacts may appear in the captured photos, affecting the photos' visual quality. The main challenge in flare removal is to eliminate various flare artifacts while preserving the original content of the image. To address this challenge, we propose a lightweight Multi-Frequency Deflare Network (MFDNet) based on the Laplacian Pyra… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted by The Visual Computer journal

  26. arXiv:2406.17707  [pdf, other

    cs.CV cs.RO

    SurgeMOD: Translating image-space tissue motions into vision-based surgical forces

    Authors: Mikel De Iturrate Reyzabal, Dionysios Malas, Shuai Wang, Sebastien Ourselin, Hongbin Liu

    Abstract: We present a new approach for vision-based force estimation in Minimally Invasive Robotic Surgery based on frequency domain basis of motion of organs derived directly from video. Using internal movements generated by natural processes like breathing or the cardiac cycle, we infer the image-space basis of the motion on the frequency domain. As we are working with this representation, we discretize… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  27. arXiv:2406.17626  [pdf, other

    cs.CL cs.AI

    CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference

    Authors: Erxin Yu, **g Li, Ming Liao, Siqi Wang, Zuchen Gao, Fei Mi, Lanqing Hong

    Abstract: As large language models (LLMs) constantly evolve, ensuring their safety remains a critical research problem. Previous red-teaming approaches for LLM safety have primarily focused on single prompt attacks or goal hijacking. To the best of our knowledge, we are the first to study LLM safety in multi-turn dialogue coreference. We created a dataset of 1,400 questions across 14 categories, each featur… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Submitted to EMNLP 2024

  28. arXiv:2406.17565  [pdf, other

    cs.DC

    MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool

    Authors: Cunchen Hu, Heyang Huang, Junhao Hu, Jiang Xu, Xusheng Chen, Tao Xie, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan

    Abstract: Large language model (LLM) serving has transformed from stateless to stateful systems, utilizing techniques like context caching and disaggregated inference. These optimizations extend the lifespan and domain of the KV cache, necessitating a new architectural approach. We present MemServe, a unified system that integrates both inter-request and intra-request optimizations. MemServe introduces MemP… ▽ More

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  29. arXiv:2406.17289  [pdf, other

    cs.IR cs.AI

    Hyperbolic Knowledge Transfer in Cross-Domain Recommendation System

    Authors: Xin Yang, Heng Chang, Zhijian La, **ze Yang, Xingrun Li, Yu Lu, Shuaiqiang Wang, Dawei Yin, Erxue Min

    Abstract: Cross-Domain Recommendation (CDR) seeks to utilize knowledge from different domains to alleviate the problem of data sparsity in the target recommendation domain, and it has been gaining more attention in recent years. Although there have been notable advancements in this area, most current methods represent users and items in Euclidean space, which is not ideal for handling long-tail distributed… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  30. arXiv:2406.17097  [pdf, other

    cs.HC

    Lower Quantity, Higher Quality: Auditing News Content and User Perceptions on Twitter/X Algorithmic versus Chronological Timelines

    Authors: Stephanie Wang, Shengchun Huang, Alvin Zhou, Danaë Metaxa

    Abstract: Social media personalization algorithms increasingly influence the flow of civic information through society, resulting in concerns about "filter bubbles", "echo chambers", and other ways they might exacerbate ideological segregation and fan the spread of polarizing content. To address these concerns, we designed and conducted a sociotechnical audit (STA) to investigate how Twitter/X's timeline al… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 24 pages, 5 figures, Computer-Supported Cooperative Work

  31. arXiv:2406.16850  [pdf, other

    cs.CV cs.RO

    From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking

    Authors: Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang

    Abstract: Embodied agents require robust navigation systems to operate in unstructured environments, making the robustness of Simultaneous Localization and Map** (SLAM) models critical to embodied agent autonomy. While real-world datasets are invaluable, simulation-based benchmarks offer a scalable approach for robustness evaluations. However, the creation of a challenging and controllable noisy world wit… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 50 pages. arXiv admin note: substantial text overlap with arXiv:2402.08125

  32. arXiv:2406.16835  [pdf, other

    cs.HC

    Preserving Real-World Finger Dexterity Using a Lightweight Fingertip Haptic Device for Virtual Dexterous Manipulation

    Authors: Yunxiu XU, Siyu Wang, Shoichi Hasegawa

    Abstract: This study presents a lightweight, wearable fingertip haptic device that provides physics-based haptic feedback for dexterous manipulation in virtual environments without hindering real-world interactions. The device's design utilizes thin strings and actuators attached to the fingernails, minimizing the weight (1.76g each finger) while preserving finger flexibility. Multiple types of haptic feedb… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  33. arXiv:2406.16656  [pdf, ps, other

    cs.IT cs.DM math.CO

    Permutation Codes Correcting Multiple Deletions

    Authors: Shuche Wang, The Nguyen, Yeow Meng Chee, Van Khu Vu

    Abstract: Permutation codes in the Ulam metric, which can correct multiple deletions, have been investigated extensively recently owing to their applications. In this work, we are interested in the maximum size of the permutation codes in the Ulam metric and aim to design permutation codes that can correct multiple deletions with efficient decoding algorithms. We first present an improvement on the Gilbert-… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 9 pages

  34. arXiv:2406.16567  [pdf, other

    cs.CL

    Data Augmentation of Multi-turn Psychological Dialogue via Knowledge-driven Progressive Thought Prompting

    Authors: Jiyue Jiang, Liheng Chen, Sheng Wang, Lingpeng Kong, Yu Li, Chuan Wu

    Abstract: Existing dialogue data augmentation (DA) techniques predominantly focus on augmenting utterance-level dialogues, which makes it difficult to take dialogue contextual information into account. The advent of large language models (LLMs) has simplified the implementation of multi-turn dialogues. Due to absence of professional understanding and knowledge, it remains challenging to deliver satisfactory… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  35. arXiv:2406.16557  [pdf, other

    cs.LG cs.CY

    Efficient k-means with Individual Fairness via Exponential Tilting

    Authors: Shengkun Zhu, **shan Zeng, Yuan Sun, Sheng Wang, Xiaodong Li, Zhiyong Peng

    Abstract: In location-based resource allocation scenarios, the distances between each individual and the facility are desired to be approximately equal, thereby ensuring fairness. Individually fair clustering is often employed to achieve the principle of treating all points equally, which can be applied in these scenarios. This paper proposes a novel algorithm, tilted k-means (TKM), aiming to achieve indivi… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  36. arXiv:2406.16189  [pdf, other

    eess.IV cs.CV

    Fuzzy Attention-based Border Rendering Network for Lung Organ Segmentation

    Authors: Sheng Zhang, Yang Nan, Yingying Fang, Shiyi Wang, Xiaodan Xing, Zhifan Gao, Guang Yang

    Abstract: Automatic lung organ segmentation on CT images is crucial for lung disease diagnosis. However, the unlimited voxel values and class imbalance of lung organs can lead to false-negative/positive and leakage issues in advanced methods. Additionally, some slender lung organs are easily lost during the recycled down/up-sample procedure, e.g., bronchioles & arterioles, causing severe discontinuity issue… ▽ More

    Submitted 1 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  37. arXiv:2406.15743  [pdf, other

    cs.SE

    CasModaTest: A Cascaded and Model-agnostic Self-directed Framework for Unit Test Generation

    Authors: Chao Ni, Xiaoya Wang, Liushan Chen, Dehai Zhao, Zhengong Cai, Shaohua Wang, Xiaohu Yang

    Abstract: Though many machine learning (ML)-based unit testing generation approaches have been proposed and indeed achieved remarkable performance, they still have several limitations in effectiveness and practical usage. More precisely, existing ML-based approaches (1) generate partial content of a unit test, mainly focusing on test oracle generation; (2) mismatch the test prefix with the test oracle seman… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 14 pages, 7 figures

  38. arXiv:2406.15507  [pdf, other

    cs.CL cs.LG

    Few-shot Knowledge Graph Relational Reasoning via Subgraph Adaptation

    Authors: Haochen Liu, Song Wang, Chen Chen, Jundong Li

    Abstract: Few-shot Knowledge Graph (KG) Relational Reasoning aims to predict unseen triplets (i.e., query triplets) for rare relations in KGs, given only several triplets of these relations as references (i.e., support triplets). This task has gained significant traction due to the widespread use of knowledge graphs in various natural language processing applications. Previous approaches have utilized meta-… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  39. arXiv:2406.15279  [pdf, other

    cs.AI cs.CL

    Cross-Modality Safety Alignment

    Authors: Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, Shimin Li, **lan Fu, Xipeng Qiu, Xuan**g Huang

    Abstract: As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of human life, ensuring the safety and ethical alignment of such systems is paramount. Previous studies primarily focus on single-modality threats, which may not suffice given the integrated and complex nature of cross-modality interactions. We introduce a novel safety alignment challenge called Safe Input… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  40. arXiv:2406.15182  [pdf, other

    cs.CV

    DiffExplainer: Unveiling Black Box Models Via Counterfactual Generation

    Authors: Yingying Fang, Shuang Wu, Zihao **, Caiwen Xu, Shiyi Wang, Simon Walsh, Guang Yang

    Abstract: In the field of medical imaging, particularly in tasks related to early disease detection and prognosis, understanding the reasoning behind AI model predictions is imperative for assessing their reliability. Conventional explanation methods encounter challenges in identifying decisive features in medical image classifications, especially when discriminative features are subtle or not immediately e… ▽ More

    Submitted 26 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  41. arXiv:2406.14962  [pdf, other

    cs.CV

    Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning

    Authors: Suyi Li, Chenyi Jiang, Shidong Wang, Yang Long, Zheng Zhang, Haofeng Zhang

    Abstract: Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs. The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object, consequently decreasing the classification performance towards novel compositions. Previous remarkable works primarily addr… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  42. arXiv:2406.14903  [pdf, other

    cs.AI

    GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models

    Authors: Leyan Wang, Yonggang **, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, Wenhao Huang, Jiaheng Liu, Shi Wang, Ge Zhang, Liuyu Xiang, Zhaofeng He

    Abstract: As large language models (LLMs) continue to develop and gain widespread application, the ability of LLMs to exhibit empathy towards diverse group identities and understand their perspectives is increasingly recognized as critical. Most existing benchmarks for empathy evaluation of LLMs focus primarily on universal human emotions, such as sadness and pain, often overlooking the context of individua… ▽ More

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  43. arXiv:2406.14863  [pdf, other

    cs.CR cs.AR

    Older and Wiser: The Marriage of Device Aging and Intellectual Property Protection of Deep Neural Networks

    Authors: Ning Lin, Shaocong Wang, Yue Zhang, Yangu He, Kwunhang Wong, Arindam Basu, Dashan Shang, Xiaoming Chen, Zhongrui Wang

    Abstract: Deep neural networks (DNNs), such as the widely-used GPT-3 with billions of parameters, are often kept secret due to high training costs and privacy concerns surrounding the data used to train them. Previous approaches to securing DNNs typically require expensive circuit redesign, resulting in additional overheads such as increased area, energy consumption, and latency. To address these issues, we… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Design Automation Conference 2024

  44. arXiv:2406.14859  [pdf, other

    cs.CL cs.AI

    From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking

    Authors: Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei

    Abstract: The rapid development of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has exposed vulnerabilities to various adversarial attacks. This paper provides a comprehensive overview of jailbreaking research targeting both LLMs and MLLMs, highlighting recent advancements in evaluation benchmarks, attack techniques and defense strategies. Compared to the more advanced state of… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  45. arXiv:2406.14774  [pdf, other

    cs.LG cs.CL cs.CV

    Evaluating Numerical Reasoning in Text-to-Image Models

    Authors: Ivana Kajić, Olivia Wiles, Isabela Albuquerque, Matthias Bauer, Su Wang, Jordi Pont-Tuset, Aida Nematzadeh

    Abstract: Text-to-image generative models are capable of producing high-quality images that often faithfully depict concepts described using natural language. In this work, we comprehensively evaluate a range of text-to-image models on numerical reasoning tasks of varying difficulty, and show that even the most advanced models have only rudimentary numerical skills. Specifically, their ability to correctly… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  46. arXiv:2406.14230  [pdf, other

    cs.CL cs.AI cs.CY

    Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing

    Authors: Han Jiang, Xiaoyuan Yi, Zhihua Wei, Shu Wang, Xing Xie

    Abstract: Warning: this paper contains model outputs exhibiting unethical information. Large Language Models (LLMs) have achieved significant breakthroughs, but their generated unethical content poses potential risks. Measuring value alignment of LLMs becomes crucial for their regulation and responsible deployment. Numerous datasets have been constructed to assess social bias, toxicity, and ethics in LLMs,… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Work in progress

  47. arXiv:2406.14194  [pdf, other

    cs.CV cs.AI

    VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model

    Authors: Jie Zhang, Sibo Wang, Xiangkui Cao, Zheng Yuan, Shiguang Shan, Xilin Chen, Wen Gao

    Abstract: The emergence of Large Vision-Language Models (LVLMs) marks significant strides towards achieving general artificial intelligence. However, these advancements are tempered by the outputs that often reflect biases, a concern not yet extensively investigated. Existing benchmarks are not sufficiently comprehensive in evaluating biases due to their limited data scale, single questioning format and nar… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  48. arXiv:2406.14186  [pdf, other

    eess.IV cs.CV

    CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation

    Authors: Tingwei Liu, Miao Zhang, Leiye Liu, Jialong Zhong, Shuyao Wang, Yongri Piao, Huchuan Lu

    Abstract: Recently, the Diffusion Probabilistic Model (DPM)-based methods have achieved substantial success in the field of medical image segmentation. However, most of these methods fail to enable the diffusion model to learn edge features and non-edge features effectively and to inject them efficiently into the diffusion backbone. Additionally, the domain gap between the images features and the diffusion… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted in MICCAI 2024

  49. arXiv:2406.14117  [pdf, other

    cs.IR cs.CL

    An Investigation of Prompt Variations for Zero-shot LLM-based Rankers

    Authors: Shuoqi Sun, Shengyao Zhuang, Shuai Wang, Guido Zuccon

    Abstract: We provide a systematic understanding of the impact of specific components and wordings used in prompts on the effectiveness of rankers based on zero-shot Large Language Models (LLMs). Several zero-shot ranking methods based on LLMs have recently been proposed. Among many aspects, methods differ across (1) the ranking algorithm they implement, e.g., pointwise vs. listwise, (2) the backbone LLMs us… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  50. Dye4AI: Assuring Data Boundary on Generative AI Services

    Authors: Shu Wang, Kun Sun, Yan Zhai

    Abstract: Generative artificial intelligence (AI) is versatile for various applications, but security and privacy concerns with third-party AI vendors hinder its broader adoption in sensitive scenarios. Hence, it is essential for users to validate the AI trustworthiness and ensure the security of data boundaries. In this paper, we present a dye testing system named Dye4AI, which injects crafted trigger data… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.