Showing 1–50 of 82 results for author: Lei, X

Searching in archive cs.
  1. arXiv:2406.18192  [pdf, other]

    cs.CL cs.AI

    Methodology of Adapting Large English Language Models for Specific Cultural Contexts

    Authors: Wenjing Zhang, Siqi Xiao, Xuejiao Lei, Ning Wang, Huazheng Zhang, Meijuan An, Bikun Yang, Zhaoxiang Liu, Kai Wang, Shiguo Lian

    Abstract: The rapid growth of large language models (LLMs) has emerged as a prominent trend in the field of artificial intelligence. However, current state-of-the-art LLMs are predominantly based on English. They encounter limitations when directly applied to tasks in specific cultural domains, due to deficiencies in domain-specific knowledge and misunderstandings caused by differences in cultural values. To…

    Submitted 26 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 11 pages, 2 figures

  2. arXiv:2406.14343  [pdf, other]

    cs.AI

    iWISDM: Assessing instruction following in multimodal models at scale

    Authors: Xiaoxuan Lei, Lucas Gomez, Hao Yuan Bai, Pouya Bashivan

    Abstract: The ability to perform complex tasks from detailed instructions is a key to many remarkable achievements of our species. As humans, we are not only capable of performing a wide variety of tasks but also very complex ones that may entail hundreds or thousands of steps to complete. Large language models and their more recent multimodal counterparts that integrate textual and visual inputs have achie…

    Submitted 25 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.10311  [pdf, other]

    cs.CL cs.AI

    CHiSafetyBench: A Chinese Hierarchical Safety Benchmark for Large Language Models

    Authors: Wenjing Zhang, Xuejiao Lei, Zhaoxiang Liu, Meijuan An, Bikun Yang, KaiKai Zhao, Kai Wang, Shiguo Lian

    Abstract: With the profound development of large language models (LLMs), their safety concerns have garnered increasing attention. However, there is a scarcity of Chinese safety benchmarks for LLMs, and the existing safety taxonomies are inadequate, lacking comprehensive safety detection capabilities in authentic Chinese scenarios. In this work, we introduce CHiSafetyBench, a dedicated safety benchmark for e…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 13 pages, 3 figures

  4. arXiv:2406.10307  [pdf, other]

    cs.CL cs.AI

    What is the best model? Application-driven Evaluation for Large Language Models

    Authors: Shiguo Lian, Kaikai Zhao, Xinhui Liu, Xuejiao Lei, Bikun Yang, Wenjing Zhang, Kai Wang, Zhaoxiang Liu

    Abstract: General large language models enhanced with supervised fine-tuning and reinforcement learning from human feedback are increasingly popular in academia and industry as they generalize foundation models to various practical tasks in a prompt manner. To assist users in selecting the best model in practical application scenarios, i.e., choosing the model that meets the application requirements while m…

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2405.03008  [pdf, other]

    eess.IV cs.CV cs.LG

    DVMSR: Distillated Vision Mamba for Efficient Super-Resolution

    Authors: Xiaoyan Lei, Wenlong Zhang, Weifeng Cao

    Abstract: Efficient Image Super-Resolution (SR) aims to accelerate SR network inference by minimizing computational complexity and network parameters while preserving performance. Existing state-of-the-art Efficient Image Super-Resolution methods are based on convolutional neural networks. Few attempts have been made with Mamba to harness its long-range modeling capability and efficient computational comple…

    Submitted 11 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: 8 pages, 8 figures

  6. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  7. arXiv:2404.03491  [pdf, other]

    cs.CL cs.AI

    A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation

    Authors: Jifan Yu, Xiaohan Zhang, Yifan Xu, Xuanyu Lei, Zijun Yao, Jing Zhang, Lei Hou, Juanzi Li

    Abstract: Empowered by the large-scale pretrained language models, existing dialogue systems have demonstrated impressive performance conducting fluent and natural-sounding conversations. However, they are still plagued by the hallucination problem, causing unpredictable factual errors in the generated responses. Recently, knowledge-grounded dialogue generation models, that intentionally invoke external kno…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted by LREC-COLING 2024

  8. arXiv:2404.00674  [pdf, other]

    cs.CV

    Knowledge NeRF: Few-shot Novel View Synthesis for Dynamic Articulated Objects

    Authors: Wenxiao Cai, Xinyue Lei, Xinyu He, Junming Leo Chen, Yangang Wang

    Abstract: We present Knowledge NeRF to synthesize novel views for dynamic scenes. Reconstructing dynamic 3D scenes from few sparse views and rendering them from arbitrary perspectives is a challenging problem with applications in various domains. Previous dynamic NeRF methods learn the deformation of articulated objects from monocular videos. However, qualities of their reconstructed scenes are limited. To…

    Submitted 6 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  9. arXiv:2403.16477  [pdf, other]

    cs.IT eess.SP

    Safeguarding Next Generation Multiple Access Using Physical Layer Security Techniques: A Tutorial

    Authors: Lu Lv, Dongyang Xu, Rose Qingyang Hu, Yinghui Ye, Long Yang, Xianfu Lei, Xianbin Wang, Dong In Kim, Arumugam Nallanathan

    Abstract: Driven by the ever-increasing requirements of ultra-high spectral efficiency, ultra-low latency, and massive connectivity, the forefront of wireless research calls for the design of advanced next generation multiple access schemes to facilitate provisioning of these stringent demands. This inspires the embrace of non-orthogonal multiple access (NOMA) in future wireless communication networks. Neve…

    Submitted 21 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Invited paper by Proceedings of the IEEE

  10. arXiv:2403.11625  [pdf, other]

    cs.CV

    GaussNav: Gaussian Splatting for Visual Navigation

    Authors: Xiaohan Lei, Min Wang, Wengang Zhou, Houqiang Li

    Abstract: In embodied vision, Instance ImageGoal Navigation (IIN) requires an agent to locate a specific object depicted in a goal image within an unexplored environment. The primary difficulty of IIN stems from the necessity of recognizing the target object across varying viewpoints and rejecting potential distractors. Existing map-based navigation methods largely adopt the representation form of Bird's…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: conference

  11. arXiv:2403.06352  [pdf]

    cs.CV

    Exploring Hardware Friendly Bottleneck Architecture in CNN for Embedded Computing Systems

    Authors: Xing Lei, Longjun Liu, Zhiheng Zhou, Hongbin Sun, Nanning Zheng

    Abstract: In this paper, we explore how to design lightweight CNN architecture for embedded computing systems. We propose L-Mobilenet model for ZYNQ based hardware platform. L-Mobilenet can adapt well to the hardware computing and accelerating, and its network structure is inspired by the state-of-the-art work of Inception-ResnetV1 and MobilenetV2, which can effectively reduce parameters and delay while mai…

    Submitted 10 March, 2024; originally announced March 2024.

  12. arXiv:2402.17587  [pdf, other]

    cs.CV cs.RO

    Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

    Authors: Xiaohan Lei, Min Wang, Wengang Zhou, Li Li, Houqiang Li

    Abstract: As a new embodied vision task, Instance ImageGoal Navigation (IIN) aims to navigate to a specified object depicted by a goal image in an unexplored environment. The main challenge of this task lies in identifying the target object from different viewpoints while rejecting similar distractors. Existing ImageGoal Navigation methods usually adopt the simple Exploration-Exploitation framework and…

    Submitted 22 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  13. arXiv:2402.12058  [pdf, other]

    cs.CV cs.CL

    Scaffolding Coordinates to Promote Vision-Language Coordination in Large Multi-Modal Models

    Authors: Xuanyu Lei, Zonghan Yang, Xinrui Chen, Peng Li, Yang Liu

    Abstract: State-of-the-art Large Multi-Modal Models (LMMs) have demonstrated exceptional capabilities in vision-language tasks. Despite their advanced functionalities, the performances of LMMs are still limited in challenging scenarios that require complex reasoning with multiple levels of visual information. Existing prompting techniques for LMMs focus on either improving textual reasoning or leveraging to…

    Submitted 19 February, 2024; originally announced February 2024.

  14. arXiv:2402.08492  [pdf]

    cs.AI

    The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale

    Authors: Xiaoqiang Liu, Yubin Wang, Zicheng Huang, Boming Xu, Yilin Zeng, Xinqi Chen, Zilong Wang, Enning Yang, Xiaoxuan Lei, Yisen Huang, Xiaobo Liu

    Abstract: Background: Colonoscopy, a crucial diagnostic tool in gastroenterology, depends heavily on superior bowel preparation. ChatGPT, a large language model with emergent intelligence, also exhibits potential in medical applications. This study aims to assess the accuracy and consistency of ChatGPT in using the Boston Bowel Preparation Scale (BBPS) for colonoscopy assessment. Methods: We retrospect…

    Submitted 13 February, 2024; originally announced February 2024.

  15. arXiv:2401.04283  [pdf, ps, other]

    eess.AS cs.SD

    FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

    Authors: Yang Liu, Li Wan, Yun Li, Yiteng Huang, Ming Sun, James Luan, Yangyang Shi, Xin Lei

    Abstract: Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been restricted. In this paper, we propose DI-AEC, pioneering a diffusion-based stochastic regeneration approach dedicated to AEC. Further, we propose FADI-AEC, a fast score-based diffusion AEC framework to save computational demands, making it favorable for edge devices. It stan…

    Submitted 8 January, 2024; originally announced January 2024.

  16. arXiv:2312.15006  [pdf, other]

    cs.AI cs.CL cs.LG

    Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities

    Authors: Yuhao Chen, Chloe Wong, Hanwen Yang, Juan Aguenza, Sai Bhujangari, Benthan Vu, Xun Lei, Amisha Prasad, Manny Fluss, Eric Phuong, Minghao Liu, Raja Kumar, Vanshika Vats, James Davis

    Abstract: This study critically evaluates the efficacy of prompting methods in enhancing the mathematical reasoning capability of large language models (LLMs). The investigation uses three prescriptive prompting methods - simple, persona, and conversational prompting - known for their effectiveness in enhancing the linguistic tasks of LLMs. We conduct this analysis on OpenAI's LLM chatbot, ChatGPT-3.5, on e…

    Submitted 20 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  17. arXiv:2312.10593  [pdf, other]

    cs.CR eess.SP

    A Novel RFID Authentication Protocol Based on A Block-Order-Modulus Variable Matrix Encryption Algorithm

    Authors: Yan Wang, Ruiqi Liu, Tong Gao, Feng Shu, Xuemei Lei, Guan Gui, Jiangzhou Wang

    Abstract: In this paper, authentication for mobile radio frequency identification (RFID) systems with low-cost tags is studied. Firstly, an adaptive modulus (AM) encryption algorithm is proposed. Subsequently, in order to enhance the security without additional storage of new key matrices, a self-updating encryption order (SUEO) algorithm is designed. Furthermore, a diagonal block local transpose key matrix…

    Submitted 9 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  18. arXiv:2311.18743  [pdf, other]

    cs.CL cs.AI cs.LG

    AlignBench: Benchmarking Chinese Alignment of Large Language Models

    Authors: Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang

    Abstract: Alignment has become a critical step for instruction-tuned Large Language Models (LLMs) to become helpful assistants. However, effective evaluation of alignment for emerging Chinese LLMs is still significantly lacking, calling for real-scenario grounded, open-ended, challenging and automatic evaluations tailored for alignment. To fill in this gap, we introduce AlignBench, a comprehensive multi-dim…

    Submitted 5 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  19. arXiv:2311.18702  [pdf, other]

    cs.CL cs.AI

    CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation

    Authors: Pei Ke, Bosi Wen, Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang

    Abstract: Since the natural language processing (NLP) community started to make large language models (LLMs) act as a critic to evaluate the quality of generated texts, most of the existing works train a critique generation model on the evaluation data labeled by GPT-4's direct prompting. We observe that these models lack the ability to generate informative critiques in both pointwise grading and pairwise c…

    Submitted 26 June, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Accepted by ACL 2024 (Main Conference)

  20. arXiv:2310.18552  [pdf, other]

    physics.chem-ph cs.CE cs.LG

    The Role of Reference Points in Machine-Learned Atomistic Simulation Models

    Authors: Xiangyun Lei, Weike Ye, Joseph Montoya, Tim Mueller, Linda Hung, Jens Hummelshoej

    Abstract: This paper introduces the Chemical Environment Modeling Theory (CEMT), a novel, generalized framework designed to overcome the limitations inherent in traditional atom-centered Machine Learning Force Field (MLFF) models, widely used in atomistic simulations of chemical systems. CEMT demonstrated enhanced flexibility and adaptability by allowing reference points to exist anywhere within the modeled…

    Submitted 27 October, 2023; originally announced October 2023.

  21. arXiv:2309.16071  [pdf, other]

    cs.SI

    Influence Pathway Discovery on Social Media

    Authors: Xinyi Liu, Ruijie Wang, Dachun Sun, Jinning Li, Christina Youn, You Lyu, Jianyuan Zhan, Dayou Wu, Xinhe Xu, Mingjun Liu, Xinshuo Lei, Zhihao Xu, Yutong Zhang, Zehao Li, Qikai Yang, Tarek Abdelzaher

    Abstract: This paper addresses influence pathway discovery, a key emerging problem in today's online media. We propose a discovery algorithm that leverages recently published work on unsupervised interpretable ideological embedding, a mapping of ideological beliefs (done in a self-supervised fashion) into interpretable low-dimensional spaces. Computing the ideological embedding at scale allows one to analyz…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: This paper is accepted by IEEE CIC as an invited vision paper

  22. arXiv:2309.10993  [pdf, other]

    cs.SD cs.HC eess.AS

    Directional Source Separation for Robust Speech Recognition on Smart Glasses

    Authors: Tiantian Feng, Ju Lin, Yiteng Huang, Weipeng He, Kaustubh Kalgaonkar, Niko Moritz, Li Wan, Xin Lei, Ming Sun, Frank Seide

    Abstract: Modern smart glasses leverage advanced audio sensing and machine learning technologies to offer real-time transcribing and captioning services, considerably enriching human experiences in daily communications. However, such systems frequently encounter challenges related to environmental noises, resulting in degradation to speech recognition and speaker change detection. To improve voice quality,…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  23. arXiv:2309.07045  [pdf, other]

    cs.CL

    SafetyBench: Evaluating the Safety of Large Language Models

    Authors: Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, Minlie Huang

    Abstract: With the rapid development of Large Language Models (LLMs), increasing attention has been paid to their safety concerns. Consequently, evaluating the safety of LLMs has become an essential task for facilitating the broad applications of LLMs. Nevertheless, the absence of comprehensive safety evaluation benchmarks poses a significant impediment to effectively assess and enhance the safety of LLMs.…

    Submitted 24 June, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: ACL 2024 Main Conference

  24. arXiv:2309.04669  [pdf, other]

    cs.CV

    Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

    Authors: Yang Jin, Kun Xu, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu

    Abstract: Recently, the remarkable advance of the Large Language Model (LLM) has inspired researchers to transfer its extraordinary reasoning capability to both vision and language data. However, the prevailing approaches primarily regard the visual input as a prompt and focus exclusively on optimizing the text generation process conditioned upon vision content by a frozen LLM. Such an inequitable treatment…

    Submitted 22 March, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: ICLR 2024

  25. arXiv:2309.01947  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

    Authors: Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

    Abstract: Automatic Speech Recognition (ASR) models need to be optimized for specific hardware before they can be deployed on devices. This can be done by tuning the model's hyperparameters or exploring variations in its architecture. Re-training and re-validating models after making these changes can be a resource-intensive task. This paper presents TODM (Train Once Deploy Many), a new approach to efficien…

    Submitted 27 November, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Meta AI; Submitted to ICASSP 2024

  26. arXiv:2308.03688  [pdf, other]

    cs.AI cs.CL cs.LG

    AgentBench: Evaluating LLMs as Agents

    Authors: Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

    Abstract: Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments. We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Age…

    Submitted 25 October, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: 55 pages

  27. arXiv:2307.15759  [pdf, other]

    physics.chem-ph cs.AI cs.CL

    Lessons in Reproducibility: Insights from NLP Studies in Materials Science

    Authors: Xiangyun Lei, Edward Kim, Viktoriia Baibakova, Shijing Sun

    Abstract: Natural Language Processing (NLP), a cornerstone field within artificial intelligence, has been increasingly utilized in the field of materials science literature. Our study conducts a reproducibility analysis of two pioneering works within this domain: "Machine-learned and codified synthesis parameters of oxide materials" by Kim et al., and "Unsupervised word embeddings capture latent knowledge f…

    Submitted 28 July, 2023; originally announced July 2023.

  28. arXiv:2307.13314  [pdf, other]

    cs.CV

    Mitigating Cross-client GANs-based Attack in Federated Learning

    Authors: Hong Huang, Xinyu Lei, Tao Xiang

    Abstract: Machine learning makes multimedia data (e.g., images) more attractive; however, multimedia data is usually distributed and privacy sensitive. Multiple distributed multimedia clients can resort to federated learning (FL) to jointly learn a global shared model without requiring to share their private samples with any third-party entities. In this paper, we show that FL suffers from the cross-client…

    Submitted 25 July, 2023; originally announced July 2023.

  29. arXiv:2304.09438  [pdf, ps, other]

    cs.IT

    Contrastive Learning based Semantic Communication for Wireless Image Transmission

    Authors: Shunpu Tang, Qianqian Yang, Lisheng Fan, Xianfu Lei, Yansha Deng, Arumugam Nallanathan

    Abstract: Recently, semantic communication has been widely applied in wireless image transmission systems as it can prioritize the preservation of meaningful semantic information in images over the accuracy of transmitted symbols, leading to improved communication efficiency. However, existing semantic communication approaches still face limitations in achieving considerable inference performance in downstr…

    Submitted 19 April, 2023; originally announced April 2023.

  30. arXiv:2303.12980  [pdf, ps, other]

    physics.chem-ph cs.CE

    GMP-Featurizer: A parallelized Python package for efficiently computing the Gaussian Multipole features of atomic systems

    Authors: Xiangyun Lei, Joseph Montoya

    Abstract: GMP-Featurizer is a lightweight, accurate, efficient, and scalable software package for calculating the Gaussian Multipole (GMP) features for a variety of atomic systems with elements across the periodic table. Starting from the GMP feature computation module from AmpTorch, the capability of GMP-Featurizer has since been greatly improved, including its accuracy and effic…

    Submitted 22 March, 2023; originally announced March 2023.

  31. arXiv:2303.09037  [pdf, other]

    cs.RO

    Homography matrix based trajectory planning method for robot uncalibrated visual servoing

    Authors: Zhongtao Fu, Xiaoyu Lei, Xubing Chen, Mohamed Ibrahim Ahmed, Cong Zhang, Miao Li, Tao Huang

    Abstract: In view of the classical visual servoing trajectory planning method which only considers the camera trajectory, this paper proposes one homography matrix based trajectory planning method for robot uncalibrated visual servoing. Taking the robot-end-effector frame as one generic case, eigenvalue decomposition is utilized to calculate the infinite homography matrix of the robot-end-effector trajector…

    Submitted 30 August, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  32. arXiv:2302.13596  [pdf, other]

    eess.IV cs.CV

    LSR: A Light-Weight Super-Resolution Method

    Authors: Wei Wang, Xuejing Lei, Yueru Chen, Ming-Sui Lee, C. -C. Jay Kuo

    Abstract: A light-weight super-resolution (LSR) method from a single image targeting mobile applications is proposed in this work. LSR predicts the residual image between the interpolated low-resolution (ILR) and high-resolution (HR) images using a self-supervised framework. To lower the computational complexity, LSR does not adopt the end-to-end optimization deep networks. It consists of three modules: 1)…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 8 pages, 3 figures, 10 tables

    ACM Class: I.4.3

  33. arXiv:2302.08341  [pdf]

    physics.geo-ph cs.CE

    Enable High-resolution, Real-time Ensemble Simulation and Data Assimilation of Flood Inundation using Distributed GPU Parallelization

    Authors: Junyu Wei, Xiangyu Luo, Weihong Liao, Xiaohui Lei, Jianshi Zhao, Haocheng Huang, Hao Wang

    Abstract: Numerical modeling of the intensity and evolution of flood events is affected by multiple sources of uncertainty such as precipitation and land surface conditions. To quantify and curb these uncertainties, an ensemble-based simulation and data assimilation model for pluvial flood inundation is constructed. The shallow water equation is decoupled in the x and y directions, and the inertial form of…

    Submitted 16 February, 2023; originally announced February 2023.

  34. LibSignal: An Open Library for Traffic Signal Control

    Authors: Hao Mei, Xiaoliang Lei, Longchao Da, Bin Shi, Hua Wei

    Abstract: This paper introduces a library for cross-simulator comparison of reinforcement learning models in traffic signal control tasks. This library is developed to implement recent state-of-the-art reinforcement learning models with extensible interfaces and unified cross-simulator evaluation metrics. It supports commonly-used simulators in traffic signal control tasks, including Simulation of Urban MOb…

    Submitted 29 November, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

    Comments: 11 pages + 6 pages appendix. Accepted by Machine Learning Journal (2023). A short version is accepted by NeurIPS 2022 Workshop: Reinforcement Learning for Real Life. Website: https://darl-libsignal.github.io/

  35. arXiv:2211.04635  [pdf, other]

    cs.LG cs.AI eess.AS

    LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting

    Authors: Haichuan Yang, Zhaojun Yang, Li Wan, Biqiao Zhang, Yangyang Shi, Yiteng Huang, Ivaylo Enchev, Limin Tang, Raziel Alvarez, Ming Sun, Xin Lei, Raghuraman Krishnamoorthi, Vikas Chandra

    Abstract: This paper proposes a hardware-efficient architecture, Linearized Convolution Network (LiCo-Net) for keyword spotting. It is optimized specifically for low-power processor units like microcontrollers. ML operators exhibit heterogeneous efficiency profiles on power-efficient hardware. Given the exact theoretical computation cost, int8 operators are more computation-effective than float operators, a…

    Submitted 8 November, 2022; originally announced November 2022.

  36. arXiv:2211.00589  [pdf, other]

    eess.AS cs.SD eess.SP

    SCA: Streaming Cross-attention Alignment for Echo Cancellation

    Authors: Yang Liu, Yangyang Shi, Yun Li, Kaustubh Kalgaonkar, Sriram Srinivasan, Xin Lei

    Abstract: End-to-End deep learning has shown promising results for speech enhancement tasks, such as noise suppression, dereverberation, and speech separation. However, most state-of-the-art methods for echo cancellation are either classical DSP-based or hybrid DSP-ML algorithms. Components such as the delay estimator and adaptive linear filter are based on traditional signal processing concepts, and deep l…

    Submitted 1 November, 2022; originally announced November 2022.

  37. arXiv:2210.08861  [pdf, other]

    cs.IT cs.LG

    A Unitary Transform Based Generalized Approximate Message Passing

    Authors: Jiang Zhu, Xiangming Meng, Xupeng Lei, Qinghua Guo

    Abstract: We consider the problem of recovering an unknown signal ${\mathbf x}\in {\mathbb R}^n$ from general nonlinear measurements obtained through a generalized linear model (GLM), i.e., ${\mathbf y}= f\left({\mathbf A}{\mathbf x}+{\mathbf w}\right)$, where $f(\cdot)$ is a componentwise nonlinear function. Based on the unitary transform approximate message passing (UAMP) and expectation propagation, a un…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: 5 pages, 3 figures

  38. arXiv:2210.03689  [pdf, ps, other]

    eess.IV cs.CV

    GENHOP: An Image Generation Method Based on Successive Subspace Learning

    Authors: Xuejing Lei, Wei Wang, C. -C. Jay Kuo

    Abstract: Being different from deep-learning-based (DL-based) image generation methods, a new image generative model built upon successive subspace learning principle is proposed and named GenHop (an acronym of Generative PixelHop) in this work. GenHop consists of three modules: 1) high-to-low dimension reduction, 2) seed image generation, and 3) low-to-high dimension expansion. In the first module, it buil…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: 10 pages, 5 figures, accepted by ISCAS 2022

  39. arXiv:2210.02445  [pdf, other]

    eess.IV cs.CV cs.LG

    Localizing Anatomical Landmarks in Ocular Images using Zoom-In Attentive Networks

    Authors: Xiaofeng Lei, Shaohua Li, Xinxing Xu, Huazhu Fu, Yong Liu, Yih-Chung Tham, Yangqin Feng, Mingrui Tan, Yanyu Xu, Jocelyn Hui Lin Goh, Rick Siow Mong Goh, Ching-Yu Cheng

    Abstract: Localizing anatomical landmarks is an important task in medical image analysis. However, the landmarks to be localized often lack prominent visual features. Their locations are elusive and easily confused with the background, and thus precise localization highly depends on the context formed by their surrounding areas. In addition, the required precision is usually higher than segmentation and obje…

    Submitted 22 December, 2022; v1 submitted 25 September, 2022; originally announced October 2022.

  40. arXiv:2209.08539  [pdf, other]

    cs.RO

    Dynamic Control Barrier Function-based Model Predictive Control to Safety-Critical Obstacle-Avoidance of Mobile Robot

    Authors: Zhuozhu Jian, Zihong Yan, Xuanang Lei, Zihong Lu, Bin Lan, Xueqian Wang, Bin Liang

    Abstract: This paper presents an efficient and safe method to avoid static and dynamic obstacles based on LiDAR. First, point cloud is used to generate a real-time local grid map for obstacle detection. Then, obstacles are clustered by DBSCAN algorithm and enclosed with minimum bounding ellipses (MBEs). In addition, data association is conducted to match each MBE with the obstacle in the current frame. Cons…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: Submitted to IEEE International Conference on Robotics and Automation (ICRA) 2023

  41. Modeling Network-level Traffic Flow Transitions on Sparse Data

    Authors: Xiaoliang Lei, Hao Mei, Bin Shi, Hua Wei

    Abstract: Modeling how network-level traffic flow changes in the urban environment is useful for decision-making in transportation, public safety and urban planning. The traffic flow system can be viewed as a dynamic process that transits between states (e.g., traffic volumes on each road segment) over time. In the real-world traffic system with traffic operation actions like traffic signal control or rever…

    Submitted 19 November, 2022; v1 submitted 13 August, 2022; originally announced August 2022.

    Comments: 9 pages + 3 pages appendix. Accepted by SIGKDD 2022

    ACM Class: I.2.0

  42. arXiv:2207.11447  [pdf, other]

    cs.LG

    Handling Data Heterogeneity in Federated Learning via Knowledge Distillation and Fusion

    Authors: Xu Zhou, Xinyu Lei, Cong Yang, Yichun Shi, Xiao Zhang, Jingwen Shi

    Abstract: Federated learning (FL) supports distributed training of a global machine learning model across multiple devices with the help of a central server. However, data heterogeneity across different devices leads to the client model drift issue and results in model performance degradation and poor model fairness. To address the issue, we design Federated learning with global-local Knowledge Fusion (FedK…

    Submitted 4 October, 2023; v1 submitted 23 July, 2022; originally announced July 2022.

    Comments: 15 pages, 3 figures

  43. KinD-LCE Curve Estimation And Retinex Fusion On Low-Light Image

    Authors: Xiaochun Lei, Weiliang Mai, Junlin Xie, He Liu, Zetao Jiang, Zhaoting Gong, Chang Lu, Linjun Lu

    Abstract: Low-light images often suffer from noise and color distortion. Object detection, semantic segmentation, instance segmentation, and other tasks are challenging when working with low-light images because of image noise and chromatic aberration. We also found that the conventional Retinex theory loses information in adjusting the image for low-light tasks. In response to the aforementioned problem, t…

    Submitted 23 October, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted by Signal, Image and Video Processing

  44. arXiv:2206.10062  [pdf, other]

    cs.RO cs.AI

    Early Recall, Late Precision: Multi-Robot Semantic Object Mapping under Operational Constraints in Perceptually-Degraded Environments

    Authors: Xianmei Lei, Taeyeon Kim, Nicolas Marchal, Daniel Pastor, Barry Ridge, Frederik Schöller, Edward Terry, Fernando Chavez, Thomas Touma, Kyohei Otsu, Ali Agha

    Abstract: Semantic object mapping in uncertain, perceptually degraded environments during long-range multi-robot autonomous exploration tasks such as search-and-rescue is important and challenging. During such missions, high recall is desirable to avoid missing true target objects and high precision is also critical to avoid wasting valuable operational time on false positives. Given recent advancements in…

    Submitted 20 June, 2022; originally announced June 2022.

  45. arXiv:2205.04639  [pdf, other]

    cs.CV

    STDC-MA Network for Semantic Segmentation

    Authors: Xiaochun Lei, Linjun Lu, Zetao Jiang, Zhaoting Gong, Chang Lu, Jiaming Liang

    Abstract: Semantic segmentation is applied extensively in autonomous driving and intelligent transportation with methods that highly demand spatial and semantic information. Here, an STDC-MA network is proposed to meet these demands. First, the STDC-Seg structure is employed in STDC-MA to ensure a lightweight and efficient structure. Subsequently, the feature alignment module (FAM) is applied to understand…

    Submitted 10 May, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: 10 pages, 5 figures

    MSC Class: 68U10 ACM Class: I.4.0

  46. arXiv:2205.04638  [pdf, other]

    cs.CV

    Using Frequency Attention to Make Adversarial Patch Powerful Against Person Detector

    Authors: Xiaochun Lei, Chang Lu, Zetao Jiang, Zhaoting Gong, Xiang Cai, Linjun Lu

    Abstract: Deep neural networks (DNNs) are vulnerable to adversarial attacks. In particular, object detectors may be attacked by applying a particular adversarial patch to the image. However, because the patch shrinks during preprocessing, most existing approaches that employ adversarial patches to attack object detectors would diminish the attack success rate on small and medium targets. This paper proposes…

    Submitted 11 May, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: 10 pages, 4 figures

    MSC Class: 68U10 ACM Class: I.4.0

  47. arXiv:2204.08244  [pdf, ps, other]

    cs.IT

    RIS-Assisted Cooperative NOMA with SWIPT

    Authors: Juanjuan Ren, Xianfu Lei, Zhangjie Peng, Xiaohu Tang, Octavia A. Dobre

    Abstract: This paper studies the application of reconfigurable intelligent surface (RIS) to cooperative non-orthogonal multiple access (C-NOMA) networks with simultaneous wireless information and power transfer (SWIPT). We aim for maximizing the rate of the strong user with guaranteed weak user's quality of service (QoS) by jointly optimizing power splitting factors, beamforming coefficients, and RIS reflec…

    Submitted 18 April, 2022; originally announced April 2022.

  48. arXiv:2202.13855  [pdf, other]

    cs.CV

    Large-Scale 3D Semantic Reconstruction for Automated Driving Vehicles with Adaptive Truncated Signed Distance Function

    Authors: Haohao Hu, Hexing Yang, Jian Wu, Xiao Lei, Frank Bieder, Jan-Hendrik Pauls, Christoph Stiller

    Abstract: The Large-scale 3D reconstruction, texturing and semantic mapping are nowadays widely used for automated driving vehicles, virtual reality and automatic data generation. However, most approaches are developed for RGB-D cameras with colored dense point clouds and not suitable for large-scale outdoor environments using sparse LiDAR point clouds. Since a 3D surface can be usually observed from multip…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: 8 pages

  49. arXiv:2202.11958  [pdf, other]

    cs.AI

    Cognitive Semantic Communication Systems Driven by Knowledge Graph

    Authors: Fuhui Zhou, Yihao Li, Xinyuan Zhang, Qihui Wu, Xianfu Lei, Rose Qingyang Hu

    Abstract: Semantic communication is envisioned as a promising technique to break through the Shannon limit. However, the existing semantic communication frameworks do not involve inference and error correction, which limits the achievable performance. In this paper, in order to tackle this issue, a cognitive semantic communication framework is proposed by exploiting knowledge graph. Moreover, a simple, gene…

    Submitted 24 February, 2022; originally announced February 2022.

  50. arXiv:2202.05612  [pdf, other]

    stat.ML cs.LG math.ST

    High-dimensional Inference and FDR Control for Simulated Markov Random Fields

    Authors: Haoyu Wei, Xiaoyu Lei, Yixin Han, Huiming Zhang

    Abstract: Identifying important features linked to a response variable is a fundamental task in various scientific domains. This article explores statistical inference for simulated Markov random fields in high-dimensional settings. We introduce a methodology based on Markov Chain Monte Carlo Maximum Likelihood Estimation (MCMC-MLE) with Elastic-net regularization. Under mild conditions on the MCMC method,…

    Submitted 19 January, 2024; v1 submitted 11 February, 2022; originally announced February 2022.