Showing 1–50 of 598 results for author: Zhou, B

Searching in archive cs.
  1. arXiv:2407.01222 [pdf, other]

    cs.RO

    Deep Learning Models for Flapping Fin Unmanned Underwater Vehicle Control System Gait Optimization

    Authors: Brian Zhou, Kamal Viswanath, Jason Geder, Alisha Sharma, Julian Lee

    Abstract: The last few decades have led to the rise of research focused on propulsion and control systems for bio-inspired unmanned underwater vehicles (UUVs), which provide more maneuverable alternatives to traditional UUVs in underwater missions. Recent work has explored the use of time-series neural network surrogate models to predict thrust and power from vehicle design and fin kinematics. We develop a…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages, 20 figures. arXiv admin note: text overlap with arXiv:2310.14135

  2. arXiv:2407.00577 [pdf, other]

    cs.RO

    FALCON: Fast Autonomous Aerial Exploration using Coverage Path Guidance

    Authors: Yichen Zhang, Xinyi Chen, Chen Feng, Boyu Zhou, Shaojie Shen

    Abstract: This paper introduces FALCON, a novel Fast Autonomous expLoration framework using COverage path guidaNce, which aims at setting a new performance benchmark in the field of autonomous aerial exploration. Despite recent advancements in the domain, existing exploration planners often suffer from inefficiencies such as frequent revisitations of previously explored regions. FALCON effectively harnesses…

    Submitted 29 June, 2024; originally announced July 2024.

  3. arXiv:2406.19755 [pdf, other]

    q-bio.QM cs.AI

    Protein Representation Learning with Sequence Information Embedding: Does it Always Lead to a Better Performance?

    Authors: Yang Tan, Lirong Zheng, Bozitao Zhong, Liang Hong, Bingxin Zhou

    Abstract: Deep learning has become a crucial tool in studying proteins. While the significance of modeling protein structure has been discussed extensively in the literature, amino acid types are typically included in the input as a default operation for many inference tasks. This study demonstrates with a structure alignment task that embedding amino acid types in some cases may not help a deep learning mode…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures

  4. arXiv:2406.16928 [pdf, other]

    eess.SP cs.LG

    A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification

    Authors: Wei Huang, Ning Wang, Panpan Feng, Haiyan Wang, Zongmin Wang, Bing Zhou

    Abstract: Electrocardiograms (ECG), which record the electrophysiological activity of the heart, have become a crucial tool for diagnosing these diseases. In recent years, the application of deep learning techniques has significantly improved the performance of ECG signal classification. Multi-resolution feature analysis, which captures and processes information at different time scales, can extract subtle…

    Submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2406.13228 [pdf, other]

    cs.LG cs.AI cs.CR

    AGSOA: Graph Neural Network Targeted Attack Based on Average Gradient and Structure Optimization

    Authors: Yang Chen, Bin Zhou

    Abstract: Graph Neural Networks (GNNs) are vulnerable to adversarial attacks that cause performance degradation by adding small perturbations to the graph. Gradient-based attacks are one of the most commonly used methods and have achieved good performance in many attack scenarios. However, current gradient attacks easily fall into local optima and have poor attack invisibility. Specifically,…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.13215 [pdf, other]

    cs.CV cs.AI

    Neural Residual Diffusion Models for Deep Scalable Vision Generation

    Authors: Zhiyuan Ma, Liangliang Zhao, Biqing Qi, Bowen Zhou

    Abstract: The most advanced diffusion models have recently adopted increasingly deep stacked networks (e.g., U-Net or Transformer) to promote the generative emergence capabilities of vision generation models similar to large language models (LLMs). However, progressively deeper stacked networks will intuitively cause numerical propagation errors and reduce noisy prediction capabilities on generative data, w…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2406.12295 [pdf, other]

    cs.CL

    Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding

    Authors: Kaiyan Zhang, Jianyu Wang, Ning Ding, Biqing Qi, Ermo Hua, Xingtai Lv, Bowen Zhou

    Abstract: Large Language Models (LLMs) demonstrate impressive performance in diverse applications, yet they face significant drawbacks, including high inference latency, expensive training cost, and generation of hallucination. Collaborative decoding between large and small language models (SLMs) offers a novel approach to address these challenges. Inspired by dual-process cognitive theory, we integrate the…

    Submitted 18 June, 2024; originally announced June 2024.

  8. arXiv:2406.11914 [pdf, other]

    cs.LG cs.ET eess.SP

    Initial Investigation of Kolmogorov-Arnold Networks (KANs) as Feature Extractors for IMU Based Human Activity Recognition

    Authors: Mengxi Liu, Daniel Geißler, Dominique Nshimyimana, Sizhen Bian, Bo Zhou, Paul Lukowicz

    Abstract: In this work, we explore the use of a novel neural network architecture, the Kolmogorov-Arnold Networks (KANs) as feature extractors for sensor-based (specifically IMU) Human Activity Recognition (HAR). Where conventional networks perform a parameterized weighted sum of the inputs at each node and then feed the result into a statically defined nonlinearity, KANs perform non-linear computations rep…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: This paper is under review

  9. arXiv:2406.11243 [pdf, other]

    cs.CL cs.AI

    FamiCom: Further Demystifying Prompts for Language Models with Task-Agnostic Performance Estimation

    Authors: Bangzheng Li, Ben Zhou, Xingyu Fu, Fei Wang, Dan Roth, Muhao Chen

    Abstract: Language models have shown impressive in-context-learning capabilities, which allow them to benefit from input prompts and perform better on downstream end tasks. Existing works investigate the mechanisms behind this observation, and propose label-agnostic prompt metrics that can better estimate end-task performances. One popular approach is using perplexity as a way to measure models' familiarity…

    Submitted 17 June, 2024; originally announced June 2024.

  10. arXiv:2406.09386 [pdf, other]

    cs.CV

    SimGen: Simulator-conditioned Driving Scene Generation

    Authors: Yunsong Zhou, Michael Simon, Zhenghao Peng, Sicheng Mo, Hongzi Zhu, Minyi Guo, Bolei Zhou

    Abstract: Controllable synthetic data generation can substantially lower the annotation cost of training data in autonomous driving research and development. Prior works use diffusion models to generate driving images conditioned on the 3D object layout. However, those models are trained on small-scale datasets like nuScenes, which lack appearance and layout diversity. Moreover, the trained models can only…

    Submitted 13 June, 2024; originally announced June 2024.

  11. arXiv:2406.08374 [pdf, other]

    cs.CV cs.AI eess.IV

    2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

    Authors: Tianqi Chen, Jun Hou, Yinchi Zhou, Huidong Xie, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, James S. Duncan, Chi Liu, Bo Zhou

    Abstract: Positron Emission Tomography (PET) is an important clinical imaging tool but inevitably introduces radiation hazards to patients and healthcare providers. Reducing the tracer injection dose and eliminating the CT acquisition for attenuation correction can reduce the overall radiation dose, but often results in PET with high noise and bias. Thus, it is desirable to develop 3D methods to translate t…

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures

  12. arXiv:2406.07540 [pdf, other]

    cs.CV cs.LG

    Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

    Authors: Kuan Heng Lin, Sicheng Mo, Ben Klingher, Fangzhou Mu, Bolei Zhou

    Abstract: Recent controllable generation approaches such as FreeControl and Diffusion Self-guidance bring fine-grained spatial and appearance control to text-to-image (T2I) diffusion models without training auxiliary modules. However, these methods optimize the latent embedding for each type of score function with longer diffusion steps, making the generation process time-consuming and limiting their flexib…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures, see project page at https://genforce.github.io/ctrl-x

  13. arXiv:2406.07294 [pdf, other]

    cs.RO cs.CV

    OTO Planner: An Efficient Only Travelling Once Exploration Planner for Complex and Unknown Environments

    Authors: Bo Zhou, Chuanzhao Lu, Yan Pan, Fu Chen

    Abstract: Autonomous exploration in complex and cluttered environments is essential for various applications. However, there are many challenges due to the lack of global heuristic information. Existing exploration methods suffer from the repeated paths and considerable computational resource requirement in large-scale environments. To address the above issues, this letter proposes an efficient exploration…

    Submitted 11 June, 2024; originally announced June 2024.

  14. arXiv:2406.05534 [pdf, other]

    cs.AI cs.CL cs.LG

    Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

    Authors: Biqing Qi, Pengfei Li, Fangyuan Li, Junqi Gao, Kaiyan Zhang, Bowen Zhou

    Abstract: Direct Preference Optimization (DPO) improves the alignment of large language models (LLMs) with human values by training directly on human preference datasets, eliminating the need for reward models. However, due to the presence of cross-domain human preferences, direct continual training can lead to catastrophic forgetting, limiting DPO's performance and efficiency. Inspired by intraspecific com…

    Submitted 8 June, 2024; originally announced June 2024.

  15. arXiv:2406.05532 [pdf, other]

    cs.LG cs.AI

    Exploring Adversarial Robustness of Deep State Space Models

    Authors: Biqing Qi, Yang Luo, Junqi Gao, Pengfei Li, Kai Tian, Zhiyuan Ma, Bowen Zhou

    Abstract: Deep State Space Models (SSMs) have proven effective in numerous task scenarios but face significant security challenges due to Adversarial Perturbations (APs) in real-world deployments. Adversarial Training (AT) is a mainstream approach to enhancing Adversarial Robustness (AR) and has been validated on various traditional DNN architectures. However, its effectiveness in improving the AR of SSMs r…

    Submitted 8 June, 2024; originally announced June 2024.

  16. arXiv:2406.05531 [pdf, other]

    cs.LG cs.AI

    Enhancing Adversarial Transferability via Information Bottleneck Constraints

    Authors: Biqing Qi, Junqi Gao, Jianxing Liu, Ligang Wu, Bowen Zhou

    Abstract: From the perspective of information bottleneck (IB) theory, we propose a novel framework for performing black-box transferable adversarial attacks named IBTA, which leverages advancements in invariant features. Intuitively, diminishing the reliance of adversarial perturbations on the original data, under equivalent attack performance constraints, encourages a greater reliance on invariant features…

    Submitted 8 June, 2024; originally announced June 2024.

    Journal ref: IEEE Signal Processing Letters, 2024

  17. arXiv:2406.03949 [pdf, other]

    cs.CL

    UltraMedical: Building Specialized Generalists in Biomedicine

    Authors: Kaiyan Zhang, Sihang Zeng, Ermo Hua, Ning Ding, Zhang-Ren Chen, Zhiyuan Ma, Haoxin Li, Ganqu Cui, Biqing Qi, Xuekai Zhu, Xingtai Lv, Hu Jinfang, Zhiyuan Liu, Bowen Zhou

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains and are moving towards more specialized areas. Recent advanced proprietary models such as GPT-4 and Gemini have achieved significant advancements in biomedicine, which have also raised privacy and security challenges. The construction of specialized generalists hinges largely on high-quality datasets, enh…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Datasets and models are available at https://github.com/TsinghuaC3I/UltraMedical

  18. arXiv:2406.01646 [pdf, other]

    cs.LG cs.AI eess.SP

    iKAN: Global Incremental Learning with KAN for Human Activity Recognition Across Heterogeneous Datasets

    Authors: Mengxi Liu, Sizhen Bian, Bo Zhou, Paul Lukowicz

    Abstract: This work proposes an incremental learning (IL) framework for wearable sensor human activity recognition (HAR) that tackles two challenges simultaneously: catastrophic forgetting and non-uniform inputs. The scalable framework, iKAN, pioneers IL with Kolmogorov-Arnold Networks (KAN) to replace multi-layer perceptrons as the classifier that leverages the local plasticity and global stability of spli…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: This work is submitted to Ubicomp/ISWC24 and is under review

  19. arXiv:2406.00504 [pdf]

    cs.RO cs.AI

    Research on an Autonomous UAV Search and Rescue System Based on the Improved

    Authors: Haobin Chen, Junyu Tao, Bize Zhou, Xiaoyan Liu

    Abstract: The demand is to solve the issue of UAVs (unmanned aerial vehicles) operating autonomously and implementing practical functions such as search and rescue in complex unknown environments. This paper proposes an autonomous search and rescue UAV system based on an EGO-Planner algorithm, which is improved by innovative UAV body application and takes the methods of inverse motor backstepping to enhance t…

    Submitted 7 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: 2024 5th International Conference on Computer Engineering and Application

  20. arXiv:2405.18424 [pdf, other]

    cs.CV

    3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

    Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang

    Abstract: Scene image editing is crucial for entertainment, photography, and advertising design. Existing methods solely focus on either 2D individual object or 3D global scene editing. This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity. In this work, we propose 3DitScene, a novel and unified scene editing framework…

    Submitted 28 May, 2024; originally announced May 2024.

  21. arXiv:2405.17534 [pdf, other]

    cs.LG

    SMR: State Memory Replay for Long Sequence Modeling

    Authors: Biqing Qi, Junqi Gao, Kaiyan Zhang, Dong Li, Jianxing Liu, Ligang Wu, Bowen Zhou

    Abstract: Despite the promising performance of state space models (SSMs) in long sequence modeling, limitations still exist. Advanced SSMs like S5 and S6 (Mamba) in addressing non-uniform sampling, their recursive structures impede efficient SSM computation via convolution. To overcome compatibility limitations in parallel convolutional computation, this paper proposes a novel non-recursive non-uniform samp…

    Submitted 8 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Journal ref: Findings of the Association for Computational Linguistics, 2024

  22. arXiv:2405.16501 [pdf, other]

    cs.CV

    User-Friendly Customized Generation with Multi-Modal Prompts

    Authors: Linhao Zhong, Yan Hong, Wentao Chen, Binglin Zhou, Yiyi Zhang, Jianfu Zhang, Liqing Zhang

    Abstract: Text-to-image generation models have seen considerable advancement, catering to the increasing interest in personalized image creation. Current customization techniques often necessitate users to provide multiple images (typically 3-5) for each customized object, along with the classification of these objects and descriptive textual prompts for scenes. This paper questions whether the process can…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  23. arXiv:2405.14866 [pdf, other]

    cs.CV

    Tele-Aloha: A Low-budget and High-authenticity Telepresence System Using Sparse RGB Cameras

    Authors: Hanzhang Tu, Ruizhi Shao, Xue Dong, Shunyuan Zheng, Hao Zhang, Lili Chen, Meili Wang, Wenyu Li, Siyan Ma, Shengping Zhang, Boyao Zhou, Yebin Liu

    Abstract: In this paper, we present a low-budget and high-authenticity bidirectional telepresence system, Tele-Aloha, targeting peer-to-peer communication scenarios. Compared to previous systems, Tele-Aloha utilizes only four sparse RGB cameras, one consumer-grade GPU, and one autostereoscopic screen to achieve high-resolution (2048x2048), real-time (30 fps), low-latency (less than 150ms) and robust distant…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Paper accepted by SIGGRAPH 2024. Project page: http://118.178.32.38/c/Tele-Aloha/

  24. arXiv:2405.13584 [pdf, other]

    cs.LG cs.DC

    Emulating Full Client Participation: A Long-Term Client Selection Strategy for Federated Learning

    Authors: Qingming Li, Juzheng Miao, Puning Zhao, Li Zhou, Shouling Ji, Bowen Zhou, Furui Liu

    Abstract: Client selection significantly affects the system convergence efficiency and is a crucial problem in federated learning. Existing methods often select clients by evaluating each round individually and overlook the necessity for long-term optimization, resulting in suboptimal performance and potential fairness issues. In this study, we propose a novel client selection strategy designed to emulate t…

    Submitted 22 May, 2024; originally announced May 2024.

  25. arXiv:2405.12223 [pdf, other]

    eess.IV cs.CV

    Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation

    Authors: Yinchi Zhou, Tianqi Chen, Jun Hou, Huidong Xie, Nicha C. Dvornek, S. Kevin Zhou, David L. Wilson, James S. Duncan, Chi Liu, Bo Zhou

    Abstract: Image-to-image translation is a vital component in medical imaging processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their c…

    Submitted 5 April, 2024; originally announced May 2024.

    Comments: 15 pages, 5 figures

  26. arXiv:2405.11870 [pdf, other]

    cs.CL cs.AI

    Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process

    Authors: Ermo Hua, Biqing Qi, Kaiyan Zhang, Yue Yu, Ning Ding, Xingtai Lv, Kai Tian, Bowen Zhou

    Abstract: Supervised Fine-Tuning (SFT) and Preference Optimization (PO) are two fundamental processes for enhancing the capabilities of Language Models (LMs) post pre-training, aligning them better with human preferences. Although SFT advances in training efficiency, PO delivers better alignment, thus they are often combined. However, common practices simply apply them sequentially without integrating their…

    Submitted 28 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  27. arXiv:2405.11788 [pdf, other]

    cs.LG

    TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models

    Authors: Junlong Jia, Ying Hu, Xi Weng, Yiming Shi, Miao Li, Xingjian Zhang, Baichuan Zhou, Ziyu Liu, Jie Luo, Lei Huang, Ji Wu

    Abstract: We present TinyLLaVA Factory, an open-source modular codebase for small-scale large multimodal models (LMMs) with a focus on simplicity of code implementations, extensibility of new features, and reproducibility of training results. Following the design philosophy of the factory pattern in software engineering, TinyLLaVA Factory modularizes the entire system into interchangeable components, with e…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Our codebase is made public at https://github.com/TinyLLaVA/TinyLLaVA_Factory with documentation available at https://tinyllava-factory.readthedocs.io/en/latest/

  28. arXiv:2405.10051 [pdf, other]

    cs.CR cs.CL

    MarkLLM: An Open-Source Toolkit for LLM Watermarking

    Authors: Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King

    Abstract: LLM watermarking, which embeds imperceptible yet algorithmically detectable signals in model outputs to identify LLM-generated text, has become crucial in mitigating the potential misuse of large language models. However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives pose challenges for researchers and the community…

    Submitted 24 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: 16 pages, 5 figures, 6 tables

    MSC Class: 68T50; ACM Class: I.2.7

  29. arXiv:2405.09050 [pdf, other]

    cs.CV

    3D Shape Augmentation with Content-Aware Shape Resizing

    Authors: Mingxiang Chen, Jian Zhang, Boli Zhou, Yang Song

    Abstract: Recent advancements in deep learning for 3D models have propelled breakthroughs in generation, detection, and scene understanding. However, the effectiveness of these algorithms hinges on large training datasets. We address the challenge by introducing Efficient 3D Seam Carving (E3SC), a novel 3D model augmentation method based on seam carving, which progressively deforms only part of the input mo…

    Submitted 14 May, 2024; originally announced May 2024.

  30. arXiv:2405.08816 [pdf, other]

    cs.CV cs.RO

    The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

    Authors: Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Yaru Niu, Wei Tsang Ooi, Benoit R. Cottereau, Lai Xing Ng, Yuexin Ma, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Weichao Qiu, Wei Zhang, Xu Cao, Hao Lu, Ying-Cong Chen, Caixin Kang, Xinning Zhou, Chengyang Ying, Wentao Shang, Xingxing Wei, Yinpeng Dong, Bo Yang, Shengyin Jiang, et al. (66 additional authors not shown)

    Abstract: In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles. Challenges such as adverse weather, sensor malfunctions, and environmental unpredictability can severely impact the performance of autonomous systems. The 2024 RoboDrive Challenge was crafted to propel the development of driving perception technologies that c…

    Submitted 29 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: ICRA 2024; 32 pages, 24 figures, 5 tables; Code at https://robodrive-24.github.io/

  31. arXiv:2405.03995 [pdf, other]

    cs.CV

    Deep Event-based Object Detection in Autonomous Driving: A Survey

    Authors: Bingquan Zhou, Jie Jiang

    Abstract: Object detection plays a critical role in autonomous driving, where accurately and efficiently detecting objects in fast-moving scenes is crucial. Traditional frame-based cameras face challenges in balancing latency and bandwidth, necessitating the need for innovative solutions. Event cameras have emerged as promising sensors for autonomous driving due to their low latency, high dynamic range, and…

    Submitted 7 May, 2024; originally announced May 2024.

  32. arXiv:2405.00344 [pdf, other]

    cs.MM

    Expert Insight-Enhanced Follow-up Chest X-Ray Summary Generation

    Authors: Zhichuan Wang, Kinhei Lee, Qiao Deng, Tiffany Y. So, Wan Hang Chiu, Yeung Yu Hui, Bingjing Zhou, Edward S. Hui

    Abstract: A chest X-ray radiology report describes abnormal findings not only from X-ray obtained at current examination, but also findings on disease progression or change in device placement with reference to the X-ray from previous examination. Majority of the efforts on automatic generation of radiology report pertain to reporting the former, but not the latter, type of findings. To the best of the auth…

    Submitted 6 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by the 22nd International Conference on Artificial Intelligence in Medicine (AIME 2024)

    ACM Class: I.2.1

  33. arXiv:2404.15999 [pdf, other]

    cs.LG eess.SP

    BeSound: Bluetooth-Based Position Estimation Enhancing with Cross-Modality Distillation

    Authors: Hymalai Bello, Sungho Suh, Bo Zhou, Paul Lukowicz

    Abstract: Smart factories leverage advanced technologies to optimize manufacturing processes and enhance efficiency. Implementing worker tracking systems, primarily through camera-based methods, ensures accurate monitoring. However, concerns about worker privacy and technology protection make it necessary to explore alternative approaches. We propose a non-visual, scalable solution using Bluetooth Low Energ…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted in IEEE 6th International Conference on Activity and Behavior Computing

  34. arXiv:2404.14850 [pdf, other]

    cs.CL cs.LG q-bio.BM

    Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models

    Authors: Yang Tan, Mingchen Li, Bingxin Zhou, Bozitao Zhong, Lirong Zheng, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong

    Abstract: Fine-tuning Pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. As a widely applied powerful technique in natural language processing, employing Parameter-Efficient Fine-Tuning techniques could potentially enhance the performance of PLMs. However, the direct transfe…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 30 pages, 4 figures, 8 tables

  35. arXiv:2404.12494 [pdf, other]

    cs.CL

    BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models

    Authors: Yu Feng, Ben Zhou, Weidong Lin, Dan Roth

    Abstract: Large language models primarily rely on inductive reasoning for decision making. This results in unreliable decisions when applied to real-world tasks that often present incomplete contexts and conditions. Thus, accurate probability estimation and appropriate interpretations are required to enhance decision-making reliability. In this paper, we propose a Bayesian inference framework called BIRD fo…

    Submitted 18 April, 2024; originally announced April 2024.

  36. arXiv:2404.11677 [pdf, other]

    cs.AI

    Cross-Problem Learning for Solving Vehicle Routing Problems

    Authors: Zhuoyi Lin, Yaoxin Wu, Bangjian Zhou, Zhiguang Cao, Wen Song, Yingqian Zhang, Senthilnath Jayavelu

    Abstract: Existing neural heuristics often train a deep architecture from scratch for each specific vehicle routing problem (VRP), ignoring the transferable knowledge across different VRP variants. This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. Particularly, we modularize neural architectures for complex VRPs into 1) the backbone Transform…

    Submitted 18 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI'24

  37. arXiv:2404.09463 [pdf]

    cs.LG

    PRIME: A CyberGIS Platform for Resilience Inference Measurement and Enhancement

    Authors: Debayan Mandal, Dr. Lei Zou, Rohan Singh Wilkho, Joynal Abedin, Bing Zhou, Dr. Heng Cai, Dr. Furqan Baig, Dr. Nasir Gharaibeh, Dr. Nina Lam

    Abstract: In an era of increased climatic disasters, there is an urgent need to develop reliable frameworks and tools for evaluating and improving community resilience to climatic hazards at multiple geographical and temporal scales. Defining and quantifying resilience in the social domain is relatively subjective due to the intricate interplay of socioeconomic factors with disaster resilience. Meanwhile, t…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 28 pages, 6 figures

  38. arXiv:2404.07779 [pdf, other]

    cs.SI physics.soc-ph

    Improving Network Degree Correlation by Degree-preserving Rewiring

    Authors: Shuo Zou, Bo Zhou, Qi Xuan

    Abstract: Degree correlation is a crucial measure in networks, significantly impacting network topology and dynamical behavior. The degree sequence of a network is a significant characteristic, and altering network degree correlation through degree-preserving rewiring poses an interesting problem. In this paper, we define the problem of maximizing network degree correlation through a finite number of rewiri…

    Submitted 11 April, 2024; originally announced April 2024.

  39. arXiv:2404.02078 [pdf, other]

    cs.AI cs.CL cs.LG

    Advancing LLM Reasoning Generalists with Preference Trees

    Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun

    Abstract: We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning. Finetuned from Mistral-7B and CodeLlama-70B, Eurus models achieve state-of-the-art results among open-source models on a diverse set of benchmarks covering mathematics, code generation, and logical reasoning problems. Notably, Eurus-70B beats GPT-3.5 Turbo in reasoning through a comprehensive benchmarking across 1…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Models and data are available at https://github.com/OpenBMB/Eurus

  40. arXiv:2404.00598 [pdf, other]

    cs.IT eess.SP

    Robust Beamforming Design and Antenna Selection for Dynamic HRIS-aided Massive MIMO Systems

    Authors: Jintao Wang, Binggui Zhou, Chengzhi Ma, Shiqi Gong, Guanghua Yang, Shaodan Ma

    Abstract: In this paper, a dynamic hybrid active-passive reconfigurable intelligent surface (HRIS) is proposed to further enhance the massive multiple-input-multiple-output (MIMO) system, since it supports the dynamic placement of active and passive elements. Specifically, considering the impact of the hardware impairments (HWIs), we investigate the channel-aware configuration of the receive antennas at the…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 5 pages, 2 figures

  41. arXiv:2404.00292 [pdf, other]

    cs.CV

    LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

    Authors: Pancheng Zhao, Peng Xu, Pengda Qin, Deng-Ping Fan, Zhicheng Zhang, Guoli Jia, Bowen Zhou, Jufeng Yang

    Abstract: Camouflaged vision perception is an important vision task with numerous practical applications. Due to the expensive collection and labeling costs, this community struggles with a major bottleneck that the species category of its datasets is limited to a small number of object species. However, the existing camouflaged generation methods require specifying the background manually, thus failing to…

    Submitted 12 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024, Fig.3 revised

  42. arXiv:2404.00205 [pdf, other]

    cs.CL

    Conceptual and Unbiased Reasoning in Language Models

    Authors: Ben Zhou, Hongming Zhang, Sihao Chen, Dian Yu, Hongwei Wang, Baolin Peng, Dan Roth, Dong Yu

    Abstract: Conceptual reasoning, the ability to reason in abstract and high-level perspectives, is key to generalization in human cognition. However, limited study has been done on large language models' capability to perform conceptual reasoning. In this work, we bridge this gap and propose a novel conceptualization framework that forces models to perform conceptual reasoning on abstract questions and gener…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Preprint under review

  43. arXiv:2403.20009  [pdf, other]

    cs.CL cs.LG

    On Large Language Models' Hallucination with Regard to Known Facts

    Authors: Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

    Abstract: Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics, an area not previously covered in studies on hallucinations. We are able to conduct this analysis via two key ideas. First, we identify the factual question…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted by NAACL 2024 Main Conference

  44. arXiv:2403.19094  [pdf, other]

    cs.CL

    Learning From Correctness Without Prompting Makes LLM Efficient Reasoner

    Authors: Yuxuan Yao, Han Wu, Zhijiang Guo, Biyan Zhou, Jiahui Gao, Sichun Luo, Hanxu Hou, Xiaojin Fu, Linqi Song

    Abstract: Large language models (LLMs) have demonstrated outstanding performance across various tasks, yet they still exhibit limitations such as hallucination, unfaithful reasoning, and toxic content. One potential approach to mitigate these issues is learning from human or external feedback (e.g. tools). In this paper, we introduce an intrinsic self-correct reasoning framework for LLMs that eliminates the…

    Submitted 27 March, 2024; originally announced March 2024.

  45. arXiv:2403.18433  [pdf, other]

    cs.HC

    iFace: Hand-Over-Face Gesture Recognition Leveraging Impedance Sensing

    Authors: Mengxi Liu, Hymalai Bello, Bo Zhou, Paul Lukowicz, Jakob Karolus

    Abstract: Hand-over-face gestures can provide important implicit interactions during conversations, such as frustration or excitement. However, in situations where interlocutors are not visible, such as phone calls or textual communication, the potential meaning contained in the hand-over-face gestures is lost. In this work, we present iFace, an unobtrusive, wearable impedance-sensing solution for recognizi…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted by Augmented Humans 2024

  46. arXiv:2403.16560  [pdf, other]

    cs.RO

    Active Admittance Control with Iterative Learning for General-Purpose Contact-Rich Manipulation

    Authors: Bo Zhou, Yuyao Sun, Wenbo Liu, Ruixuan Jiao, Fang Fang, Shihua Li

    Abstract: Force interaction is inevitable when robots face multiple operation scenarios. How to make the robot competent in force control for generalized operations such as multi-tasks still remains a challenging problem. Aiming at the reproducibility of interaction tasks and the lack of a generalized force control framework for multi-task scenarios, this paper proposes a novel hybrid control framework base…

    Submitted 25 March, 2024; originally announced March 2024.

  47. arXiv:2403.12556  [pdf, other]

    cs.CL

    Factorized Learning Assisted with Large Language Model for Gloss-free Sign Language Translation

    Authors: Zhigang Chen, Benjia Zhou, Jun Li, Jun Wan, Zhen Lei, Ning Jiang, Quan Lu, Guoqing Zhao

    Abstract: Previous Sign Language Translation (SLT) methods achieve superior performance by relying on gloss annotations. However, labeling high-quality glosses is a labor-intensive task, which limits the further development of SLT. Although some approaches work towards gloss-free SLT through jointly training the visual encoder and translation network, these efforts still suffer from poor performance and ine…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING-2024

  48. arXiv:2403.11697  [pdf, other]

    cs.CV

    Urban Scene Diffusion through Semantic Occupancy Map

    Authors: Junge Zhang, Qihang Zhang, Li Zhang, Ramana Rao Kompella, Gaowen Liu, Bolei Zhou

    Abstract: Generating unbounded 3D scenes is crucial for large-scale scene understanding and simulation. Urban scenes, unlike natural landscapes, consist of various complex man-made objects and structures such as roads, traffic signs, vehicles, and buildings. To create a realistic and detailed urban scene, it is crucial to accurately represent the geometry and semantics of the underlying objects, going beyon…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: The project website is https://metadriverse.github.io/urbandiff/

  49. arXiv:2403.11681  [pdf, other]

    cs.RO cs.CV

    MASSTAR: A Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion

    Authors: Guiyong Zheng, Jinqi Jiang, Chen Feng, Shaojie Shen, Boyu Zhou

    Abstract: Surface prediction and completion have been widely studied in various applications. Recently, research in surface completion has evolved from small objects to complex large-scale scenes. As a result, researchers have begun increasing the volume of data and leveraging a greater variety of data modalities including rendered RGB images, descriptive texts, depth images, etc, to enhance algorithm perfo…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Submitted to IROS2024. Code: https://github.com/SYSU-STAR/MASSTAR. Project Page: https://github.com/SYSU-STAR/MASSTAR

  50. arXiv:2403.10821  [pdf, other]

    cs.RO

    H3-Mapping: Quasi-Heterogeneous Feature Grids for Real-time Dense Mapping Using Hierarchical Hybrid Representation

    Authors: Chenxing Jiang, Yiming Luo, Boyu Zhou, Shaojie Shen

    Abstract: In recent years, implicit online dense mapping methods have achieved high-quality reconstruction results, showcasing great potential in robotics, AR/VR, and digital twins applications. However, existing methods struggle with slow texture modeling which limits their real-time performance. To address these limitations, we propose a NeRF-based dense mapping method that enables faster and higher-quali…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 8 pages, 11 figures, submitted to IEEE Robotics and Automation Letters