Showing 1–50 of 176 results for author: Fan, R

Searching in archive cs.
  1. arXiv:2406.15252  [pdf, other]

    cs.CV cs.AI

    VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

    Authors: Xuan He, Dongfu Jiang, Ge Zhang, Max Ku, Achint Soni, Sherman Siu, Haonan Chen, Abhranil Chandra, Ziyan Jiang, Aaran Arulraj, Kai Wang, Quy Duc Do, Yuansheng Ni, Bohan Lyu, Yaswanth Narsupalli, Rongqi Fan, Zhiheng Lyu, Yuchen Lin, Wenhu Chen

    Abstract: The recent years have witnessed great advances in video generation. However, the development of automatic video metrics is lagging significantly behind. None of the existing metric is able to provide reliable scores over generated videos. The main barrier is the lack of large-scale human-annotated dataset. In this paper, we release VideoFeedback, the first large-scale dataset containing human-prov…

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2406.15222  [pdf]

    eess.IV cs.AI cs.CV

    Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

    Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

    Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed…

    Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: under peer review

  3. arXiv:2406.12753  [pdf, other]

    cs.CL cs.AI

    OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

    Authors: Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, et al. (3 additional authors not shown)

    Abstract: The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoni…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 44 pages

  4. arXiv:2406.10512  [pdf, other]

    eess.AS cs.SD

    SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR

    Authors: Natarajan Balaji Shankar, Ruchao Fan, Abeer Alwan

    Abstract: Recently, speech foundation models have gained popularity due to their superiority in finetuning downstream ASR tasks. However, models finetuned on certain domains, such as LibriSpeech (adult read speech), behave poorly on other domains (child or noisy speech). One solution could be collecting as much labeled and diverse data as possible for joint finetuning on various domains. However, collecting…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted to ICASSP 2024 SASB Workshop

  5. arXiv:2406.10507  [pdf, other]

    eess.AS cs.CL cs.SD

    Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models

    Authors: Ruchao Fan, Natarajan Balaji Shankar, Abeer Alwan

    Abstract: Speech foundation models (SFMs) have achieved state-of-the-art results for various speech tasks in supervised (e.g. Whisper) or self-supervised systems (e.g. WavLM). However, the performance of SFMs for child ASR has not been systematically studied. In addition, there is no benchmark for child ASR with standard evaluations, making the comparisons of novel ideas difficult. In this paper, we initiat…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear in Interspeech 2024

  6. arXiv:2406.04485  [pdf, other]

    cs.AI cs.CV

    GenAI Arena: An Open Evaluation Platform for Generative Models

    Authors: Dongfu Jiang, Max Ku, Tianle Li, Yuansheng Ni, Shizhuo Sun, Rongqi Fan, Wenhu Chen

    Abstract: Generative AI has made remarkable strides to revolutionize fields such as image and video generation. These advancements are driven by innovative algorithms, architecture, and data. However, the rapid proliferation of generative models has highlighted a critical gap: the absence of trustworthy evaluation metrics. Current automatic assessments such as FID, CLIP, FVD, etc often fail to capture the n…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 9 pages,7 figures

  7. arXiv:2406.01574  [pdf, other]

    cs.CL

    MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

    Authors: Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, Wenhu Chen

    Abstract: In the age of large-scale language models, benchmarks like the Massive Multitask Language Understanding (MMLU) have been pivotal in pushing the boundaries of what AI can achieve in language comprehension and reasoning across diverse domains. However, as models continue to improve, their performance on these benchmarks has begun to plateau, making it increasingly difficult to discern differences in…

    Submitted 23 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  8. arXiv:2405.17079  [pdf, other]

    stat.ML cs.LG

    Learning with User-Level Local Differential Privacy

    Authors: Puning Zhao, Li Shen, Rongfei Fan, Qingming Li, Huiwen Wu, Jiafei Wu, Zhe Liu

    Abstract: User-level privacy is important in distributed systems. Previous research primarily focuses on the central model, while the local models have received much less attention. Under the central model, user-level DP is strictly stronger than the item-level one. However, under the local model, the relationship between user-level and item-level LDP becomes more complex, thus the analysis is crucially dif…

    Submitted 27 May, 2024; originally announced May 2024.

  9. arXiv:2405.16960  [pdf, other]

    cs.CV cs.RO

    DCPI-Depth: Explicitly Infusing Dense Correspondence Prior to Unsupervised Monocular Depth Estimation

    Authors: Mengtan Zhang, Yi Feng, Qijun Chen, Rui Fan

    Abstract: There has been a recent surge of interest in learning to perceive depth from monocular videos in an unsupervised fashion. A key challenge in this field is achieving robust and accurate depth estimation in challenging scenarios, particularly in regions with weak textures or where dynamic objects are present. This study makes three major contributions by delving deeply into dense correspondence prio…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 13 pages, 7 figures

  10. arXiv:2405.15150  [pdf, other]

    cs.LG

    Enhancing Learning with Label Differential Privacy by Vector Approximation

    Authors: Puning Zhao, Rongfei Fan, Huiwen Wu, Qingming Li, Jiafei Wu, Zhe Liu

    Abstract: Label differential privacy (DP) is a framework that protects the privacy of labels in training datasets, while the feature vectors are public. Existing approaches protect the privacy of labels by flipping them randomly, and then train a model to make the output approximate the privatized label. However, as the number of classes $K$ increases, stronger randomization is needed, thus the performances…

    Submitted 23 May, 2024; originally announced May 2024.

  11. arXiv:2405.10489  [pdf, other]

    cs.CV

    MixCut:A Data Augmentation Method for Facial Expression Recognition

    Authors: Jiaxiang Yu, Yiyang Liu, Ruiyang Fan, Guobing Sun

    Abstract: In the facial expression recognition task, researchers always get low accuracy of expression classification due to a small amount of training samples. In order to solve this kind of problem, we proposes a new data augmentation method named MixCut. In this method, we firstly interpolate the two original training samples at the pixel level in a random ratio to generate new samples. Then, pixel remov…

    Submitted 16 May, 2024; originally announced May 2024.

  12. arXiv:2405.09552  [pdf, other]

    eess.IV cs.AI cs.CV

    ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

    Authors: Jiayi Wang, Yi-An Mao, Xiaoyu Ma, Sicen Guo, Yuting Shao, Xiao Lv, Wenting Han, Mark Christopher, Linda M. Zangwill, Yanlong Bi, Rui Fan

    Abstract: Optic nerve head (ONH) detection has been a crucial area of study in ophthalmology for years. However, the significant discrepancy between fundus image datasets, each generated using a single type of fundus camera, poses challenges to the generalizability of ONH detection approaches developed based on semantic segmentation networks. Despite the numerous recent advancements in general-purpose seman…

    Submitted 2 June, 2024; v1 submitted 15 April, 2024; originally announced May 2024.

  13. arXiv:2405.07966  [pdf, other]

    cs.CV cs.AI

    OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition

    Authors: Qiuchi Xiang, Jintao Cheng, Jiehao Luo, Jin Wu, Rui Fan, Xieyuanli Chen, Xiaoyu Tang

    Abstract: Place recognition is the foundation for enabling autonomous systems to achieve independent decision-making and safe operations. It is also crucial in tasks such as loop closure detection and global localization within SLAM. Previous methods utilize mundane point cloud representations as input and deep learning-based LiDAR-based Place Recognition (LPR) approaches employing different point cloud ima…

    Submitted 13 May, 2024; originally announced May 2024.

  14. arXiv:2404.18824  [pdf, other]

    cs.CL cs.AI cs.LG

    Benchmarking Benchmark Leakage in Large Language Models

    Authors: Ruijie Xu, Zengzhi Wang, Run-Ze Fan, Pengfei Liu

    Abstract: Amid the expanding use of pre-training data, the phenomenon of benchmark dataset leakage has become increasingly prominent, exacerbated by opaque training processes and the often undisclosed inclusion of supervised data in contemporary Large Language Models (LLMs). This issue skews benchmark effectiveness and fosters potentially unfair comparisons, impeding the field's healthy development. To addr…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 30 pages; Homepage: https://gair-nlp.github.io/benbench

  15. arXiv:2404.18083  [pdf, other]

    cs.RO cs.AI cs.CV

    Online,Target-Free LiDAR-Camera Extrinsic Calibration via Cross-Modal Mask Matching

    Authors: Zhiwei Huang, Yikang Zhang, Qijun Chen, Rui Fan

    Abstract: LiDAR-camera extrinsic calibration (LCEC) is crucial for data fusion in intelligent vehicles. Offline, target-based approaches have long been the preferred choice in this field. However, they often demonstrate poor adaptability to real-world environments. This is largely because extrinsic parameters may change significantly due to moderate shocks or during extended operations in environments with…

    Submitted 19 June, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: accepted to IEEE Trans. on Intelligent Vehicles (T-IV)

  16. arXiv:2404.06261  [pdf, other]

    cs.CV cs.AI cs.RO

    Playing to Vision Foundation Model's Strengths in Stereo Matching

    Authors: Chuang-Wei Liu, Qijun Chen, Rui Fan

    Abstract: Stereo matching has become a key technique for 3D environment perception in intelligent vehicles. For a considerable time, convolutional neural networks (CNNs) have remained the mainstream choice for feature extraction in this domain. Nonetheless, there is a growing consensus that the existing paradigm should evolve towards vision foundation models (VFM), particularly those developed based on visi…

    Submitted 9 April, 2024; originally announced April 2024.

  17. arXiv:2404.03527  [pdf, other]

    cs.CV

    HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion

    Authors: Jiahang Li, Peng Yun, Qijun Chen, Rui Fan

    Abstract: Data-fusion networks have shown significant promise for RGB-thermal scene parsing. However, the majority of existing studies have relied on symmetric duplex encoders for heterogeneous feature extraction and fusion, paying inadequate attention to the inherent differences between RGB and thermal modalities. Recent progress in vision foundation models (VFMs) trained through self-supervision on vast a…

    Submitted 6 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4figures

  18. arXiv:2403.19943  [pdf, other]

    cs.LG cs.AI eess.SP

    TDANet: A Novel Temporal Denoise Convolutional Neural Network With Attention for Fault Diagnosis

    Authors: Zhongzhi Li, Rong Fan, Jingqi Tu, Jinyi Ma, Jianliang Ai, Yiqun Dong

    Abstract: Fault diagnosis plays a crucial role in maintaining the operational integrity of mechanical systems, preventing significant losses due to unexpected failures. As intelligent manufacturing and data-driven approaches evolve, Deep Learning (DL) has emerged as a pivotal technique in fault diagnosis research, recognized for its ability to autonomously extract complex features. However, the practical ap…

    Submitted 28 March, 2024; originally announced March 2024.

  19. arXiv:2403.13374  [pdf, other]

    cs.LG cs.AI cs.CR

    Byzantine-resilient Federated Learning With Adaptivity to Data Heterogeneity

    Authors: Shiyuan Zuo, Xingrun Yan, Rongfei Fan, Han Hu, Hangguan Shan, Tony Q. S. Quek

    Abstract: This paper deals with federated learning (FL) in the presence of malicious Byzantine attacks and data heterogeneity. A novel Robust Average Gradient Algorithm (RAGA) is proposed, which leverages the geometric median for aggregation and can freely select the round number for local updating. Different from most existing resilient approaches, which perform convergence analysis based on strongly-conve…

    Submitted 27 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  20. arXiv:2403.08270  [pdf, ps, other]

    cs.CV

    Identity-aware Dual-constraint Network for Cloth-Changing Person Re-identification

    Authors: Peini Guo, Mengyuan Liu, Hong Liu, Ruijia Fan, Guoquan Wang, Bin He

    Abstract: Cloth-Changing Person Re-Identification (CC-ReID) aims to accurately identify the target person in more realistic surveillance scenarios, where pedestrians usually change their clothing. Despite great progress, limited cloth-changing training samples in existing CC-ReID datasets still prevent the model from adequately learning cloth-irrelevant features. In addition, due to the absence of explicit…

    Submitted 26 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  21. arXiv:2403.08215  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving

    Authors: Sicen Guo, Zhiyuan Wu, Qijun Chen, Ioannis Pitas, Rui Fan

    Abstract: Despite the impressive performance achieved by data-fusion networks with duplex encoders for visual semantic segmentation, they become ineffective when spatial geometric data are not available. Implicitly infusing the spatial geometric prior knowledge acquired by a duplex-encoder teacher model into a single-encoder student model is a practical, albeit less explored research avenue. This paper delv…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 13 pages, 4 figures, 5 tables

  22. arXiv:2403.05500  [pdf, other]

    cs.RO

    Using Fiber Optic Bundles to Miniaturize Vision-Based Tactile Sensors

    Authors: Julia Di, Zdravko Dugonjic, Will Fu, Tingfan Wu, Romeo Mercado, Kevin Sawyer, Victoria Rose Most, Gregg Kammerer, Stefanie Speidel, Richard E. Fan, Geoffrey Sonn, Mark R. Cutkosky, Mike Lambeta, Roberto Calandra

    Abstract: Vision-based tactile sensors have recently become popular due to their combination of low cost, very high spatial resolution, and ease of integration using widely available miniature cameras. The associated field of view and focal length, however, are difficult to package in a human-sized finger. In this paper we employ optical fiber bundles to achieve a form factor that, at 15 mm diameter, is sma…

    Submitted 11 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: We open source the design of DIGIT Pinki at https://github.com/facebookresearch/digit-design

  23. arXiv:2403.05388  [pdf, other]

    cs.CV

    Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation

    Authors: Yu Han, Ziwei Long, Yanting Zhang, Jin Wu, Zhijun Fang, Rui Fan

    Abstract: Correspondence matching plays a crucial role in numerous robotics applications. In comparison to conventional hand-crafted methods and recent data-driven approaches, there is significant interest in plug-and-play algorithms that make full use of pre-trained backbone networks for multi-scale feature extraction and leverage hierarchical refinement strategies to generate matched correspondences. The…

    Submitted 8 March, 2024; originally announced March 2024.

  24. arXiv:2402.18918  [pdf, other]

    cs.CV

    SNE-RoadSegV2: Advancing Heterogeneous Feature Fusion and Fallibility Awareness for Freespace Detection

    Authors: Yi Feng, Yu Ma, Qijun Chen, Ioannis Pitas, Rui Fan

    Abstract: Feature-fusion networks with duplex encoders have proven to be an effective technique to solve the freespace detection problem. However, despite the compelling results achieved by previous research efforts, the exploration of adequate and discriminative heterogeneous feature fusion, as well as the development of fallibility-aware loss functions remains relatively scarce. This paper makes several s…

    Submitted 29 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  25. arXiv:2402.13499  [pdf, other]

    cs.AR

    Benchmarking and Dissecting the Nvidia Hopper GPU Architecture

    Authors: Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Qiang Wang, Xiaowen Chu

    Abstract: Graphics processing units (GPUs) are continually evolving to cater to the computational demands of contemporary general-purpose workloads, particularly those driven by artificial intelligence (AI) utilizing deep learning techniques. A substantial body of studies have been dedicated to dissecting the microarchitectural metrics characterizing diverse GPU generations, which helps researchers understa…

    Submitted 20 February, 2024; originally announced February 2024.

  26. arXiv:2402.12641  [pdf, other]

    cs.CV

    YOLO-Ant: A Lightweight Detector via Depthwise Separable Convolutional and Large Kernel Design for Antenna Interference Source Detection

    Authors: Xiaoyu Tang, Xingming Chen, Jintao Cheng, Jin Wu, Rui Fan, Chengxi Zhang, Zebo Zhou

    Abstract: In the era of 5G communication, removing interference sources that affect communication is a resource-intensive task. The rapid development of computer vision has enabled unmanned aerial vehicles to perform various high-altitude detection tasks. Because the field of object detection for antenna interference sources has not been fully explored, this industry lacks dedicated learning samples and det…

    Submitted 19 February, 2024; originally announced February 2024.

  27. arXiv:2402.12219  [pdf, other]

    cs.CL cs.AI cs.LG

    Reformatted Alignment

    Authors: Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu

    Abstract: The quality of finetuning data is crucial for aligning large language models (LLMs) with human values. Current methods to improve data quality are either labor-intensive or prone to factual errors caused by LLM hallucinations. This paper explores elevating the quality of existing instruction data to better align with human values, introducing a simple and effective approach named ReAlign, which re…

    Submitted 17 April, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Homepage: https://gair-nlp.github.io/ReAlign/

  28. arXiv:2402.08898  [pdf, other]

    eess.AS cs.CL cs.SD

    UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models

    Authors: Ruchao Fan, Natarajan Balaji Shankar, Abeer Alwan

    Abstract: Non-autoregressive automatic speech recognition (NASR) models have gained attention due to their parallelism and fast inference. The encoder-based NASR, e.g. connectionist temporal classification (CTC), can be initialized from the speech foundation models (SFM) but does not account for any dependencies among intermediate tokens. The encoder-decoder-based NASR, like CTC alignment-based single-step…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Published in IEEE Signal Processing Letters

  29. arXiv:2401.17023  [pdf, other]

    cs.CV

    MF-MOS: A Motion-Focused Model for Moving Object Segmentation

    Authors: Jintao Cheng, Kang Zeng, Zhuoxu Huang, Xiaoyu Tang, Jin Wu, Chengxi Zhang, Xieyuanli Chen, Rui Fan

    Abstract: Moving object segmentation (MOS) provides a reliable solution for detecting traffic participants and thus is of great interest in the autonomous driving field. Dynamic capture is always critical in the MOS problem. Previous methods capture motion features from the range images directly. Differently, we argue that the residual maps provide greater potential for motion information, while range image…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted by ICRA2024

  30. arXiv:2401.15647  [pdf, other]

    cs.CV cs.AI eess.IV

    UP-CrackNet: Unsupervised Pixel-Wise Road Crack Detection via Adversarial Image Restoration

    Authors: Nachuan Ma, Rui Fan, Lihua Xie

    Abstract: Over the past decade, automated methods have been developed to detect cracks more efficiently, accurately, and objectively, with the ultimate goal of replacing conventional manual visual inspection techniques. Among these methods, semantic segmentation algorithms have demonstrated promising results in pixel-wise crack detection tasks. However, training such networks requires a large amount of huma…

    Submitted 6 May, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  31. arXiv:2401.11414  [pdf, other]

    cs.CV cs.AI cs.RO

    S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving

    Authors: Zhiyuan Wu, Yi Feng, Chuang-Wei Liu, Fisher Yu, Qijun Chen, Rui Fan

    Abstract: Semantic segmentation and stereo matching are two essential components of 3D environmental perception systems for autonomous driving. Nevertheless, conventional approaches often address these two problems independently, employing separate models for each task. This approach poses practical limitations in real-world scenarios, particularly when computational resources are scarce or real-time perfor…

    Submitted 28 January, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: accepted to IEEE Trans. on Intelligent Vehicles (T-IV)

  32. arXiv:2401.09455  [pdf, other]

    cs.NI cs.AI cs.LG eess.SY

    Dynamic Routing for Integrated Satellite-Terrestrial Networks: A Constrained Multi-Agent Reinforcement Learning Approach

    Authors: Yifeng Lyu, Han Hu, Rongfei Fan, Zhi Liu, Jianping An, Shiwen Mao

    Abstract: The integrated satellite-terrestrial network (ISTN) system has experienced significant growth, offering seamless communication services in remote areas with limited terrestrial infrastructure. However, designing a routing scheme for ISTN is exceedingly difficult, primarily due to the heightened complexity resulting from the inclusion of additional ground stations, along with the requirement to sat…

    Submitted 22 December, 2023; originally announced January 2024.

  33. arXiv:2312.15650  [pdf, ps, other]

    cs.IT

    A Hybrid Advertising Mode for Device Discovery in Bluetooth Low Energy Networks

    Authors: Zhong Shen, Hai Jiang, Rongfei Fan, Hongxing Guo

    Abstract: Device discovery has a great impact on the performance of Bluetooth low energy (BLE). The performance of device discovery is highly related to the advertising mode. BLE has two advertising modes: pseudo-random delay advertising (RDA) and periodic deterministic advertising (PDA). Generally, PDA has low discovery latency but is susceptible to persistent collisions, whereas RDA does not suffer persis…

    Submitted 25 December, 2023; originally announced December 2023.

  34. arXiv:2312.11571  [pdf, other]

    cs.CR cs.AI cs.LG

    Model Stealing Attack against Recommender System

    Authors: Zhihao Zhu, Rui Fan, Chenwang Wu, Yi Yang, Defu Lian, Enhong Chen

    Abstract: Recent studies have demonstrated the vulnerability of recommender systems to data privacy attacks. However, research on the threat to model privacy in recommender systems, such as model stealing attacks, is still in its infancy. Some adversarial attacks have achieved model stealing attacks against recommender systems, to some extent, by collecting abundant training data of the target model (target…

    Submitted 26 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

  35. arXiv:2312.11035  [pdf, other]

    cs.CV

    Towards Effective Multi-Moving-Camera Tracking: A New Dataset and Lightweight Link Model

    Authors: Yanting Zhang, Shuanghong Wang, Qingxiang Wang, Cairong Yan, Rui Fan

    Abstract: Ensuring driving safety for autonomous vehicles has become increasingly crucial, highlighting the need for systematic tracking of on-road pedestrians. Most vehicles are equipped with visual sensors, however, the large-scale visual data has not been well studied yet. Multi-target multi-camera (MTMC) tracking systems are composed of two modules: single-camera tracking (SCT) and inter-camera tracking…

    Submitted 23 April, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  36. arXiv:2312.10943  [pdf, other]

    cs.LG cs.CR

    Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity

    Authors: Zhihao Zhu, Chenwang Wu, Rui Fan, Yi Yang, Defu Lian, Enhong Chen

    Abstract: Recent research demonstrates that GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions. However, they mainly focus on node classification tasks, neglecting the potential threats entailed within the domain of graph classification tasks. Furthermore, their practicality is questionable due to unreasonable assumptions,…

    Submitted 26 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

  37. arXiv:2312.10466  [pdf, other]

    cs.CL cs.AI cs.IR

    RIGHT: Retrieval-augmented Generation for Mainstream Hashtag Recommendation

    Authors: Run-Ze Fan, Yixing Fan, Jiangui Chen, Jiafeng Guo, Ruqing Zhang, Xueqi Cheng

    Abstract: Automatic mainstream hashtag recommendation aims to accurately provide users with concise and popular topical hashtags before publication. Generally, mainstream hashtag recommendation faces challenges in the comprehensive difficulty of newly posted tweets in response to new topics, and the accurate identification of mainstream hashtags beyond semantic correctness. However, previous retrieval-based…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted by ECIR2024 full paper

  38. arXiv:2312.07964  [pdf, other]

    cs.RO cs.CV

    Three-Filters-to-Normal+: Revisiting Discontinuity Discrimination in Depth-to-Normal Translation

    Authors: Jingwei Yang, Bohuan Xue, Yi Feng, Deming Wang, Rui Fan, Qijun Chen

    Abstract: This article introduces three-filters-to-normal+ (3F2N+), an extension of our previous work three-filters-to-normal (3F2N), with a specific focus on incorporating discontinuity discrimination capability into surface normal estimators (SNEs). 3F2N+ achieves this capability by utilizing a novel discontinuity discrimination module (DDM), which combines depth curvature minimization and correlation coe…

    Submitted 13 December, 2023; originally announced December 2023.

  39. arXiv:2312.06550  [pdf, other]

    cs.CL cs.AI cs.LG

    LLM360: Towards Fully Transparent Open-Source LLMs

    Authors: Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, Tianhua Tao, Junbo Li, Yuqi Wang, Suqi Sun, Omkar Pangarkar, Richard Fan, Yi Gu, Victor Miller, Yonghao Zhuang, Guowei He, Haonan Li, Fajri Koto, Liping Tang, Nikhil Ranjan, Zhiqiang Shen, Xuguang Ren, Roberto Iriondo, Cun Mu, Zhiting Hu, Mark Schulze, et al. (3 additional authors not shown)

    Abstract: The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most LLMs have only released partial artifacts, such as the final model weights or inference code, and technical reports increasingly limit their scope to high-level design choices and surface statistics. These choices hinder prog…

    Submitted 11 December, 2023; originally announced December 2023.

  40. arXiv:2312.05334  [pdf, other]

    eess.IV cs.CV

    ProsDectNet: Bridging the Gap in Prostate Cancer Detection via Transrectal B-mode Ultrasound Imaging

    Authors: Sulaiman Vesal, Indrani Bhattacharya, Hassan Jahanandish, Xinran Li, Zachary Kornberg, Steve Ran Zhou, Elijah Richard Sommer, Moon Hyung Choi, Richard E. Fan, Geoffrey A. Sonn, Mirabela Rusu

    Abstract: Interpreting traditional B-mode ultrasound images can be challenging due to image artifacts (e.g., shadowing, speckle), leading to low sensitivity and limited diagnostic accuracy. While Magnetic Resonance Imaging (MRI) has been proposed as a solution, it is expensive and not widely available. Furthermore, most biopsies are guided by Transrectal Ultrasound (TRUS) alone and can miss up to 52% cancer…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted in NeurIPS 2023 (Medical Imaging meets NeurIPS Workshop)

  41. arXiv:2311.17317  [pdf]

    cs.SE

    Digital Twins for Logistics and Supply Chain Systems: Literature Review, Conceptual Framework, Research Potential, and Practical Challenges

    Authors: Tho V. Le, Ruoling Fan

    Abstract: To facilitate an effective, efficient, transparent, and timely decision-making process as well as to provide guidelines for industry planning and public policy development, a conceptual framework of digital twins (DTs) for logistics and supply chain systems (LSCS) is needed. This paper first introduces the background of the logistics and supply chain industry, the DT and its potential benefits, an…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 45 pages

  42. arXiv:2311.03687  [pdf, other]

    cs.PF cs.CL cs.LG

    Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models

    Authors: Longteng Zhang, Xiang Liu, Zeyu Li, Xinglin Pan, Peijie Dong, Ruibo Fan, Rui Guo, Xin Wang, Qiong Luo, Shaohuai Shi, Xiaowen Chu

    Abstract: Large Language Models (LLMs) have seen great advance in both academia and industry, and their popularity results in numerous open-source frameworks and techniques in accelerating LLM pre-training, fine-tuning, and inference. Training and deploying LLMs are expensive as it requires considerable computing resources and memory, hence many efficient approaches have been developed for improving system…

    Submitted 1 December, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  43. arXiv:2310.09832  [pdf, other]

    cs.CL

    Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

    Authors: Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, Dacheng Tao

    Abstract: Scaling the size of language models usually leads to remarkable advancements in NLP tasks. But it often comes with a price of growing computational cost. Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly if increasing the number of activated experts, limiting its pract…

    Submitted 21 November, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Conference (Oral)

  44. arXiv:2310.05470  [pdf, other]

    cs.CL cs.AI

    Generative Judge for Evaluating Alignment

    Authors: Junlong Li, Shichao Sun, Weizhe Yuan, Run-Ze Fan, Hai Zhao, Pengfei Liu

    Abstract: The rapid development of Large Language Models (LLMs) has substantially expanded the range of tasks they can address. In the field of Natural Language Processing (NLP), researchers have shifted their focus from conventional NLP tasks (e.g., sequence tagging and parsing) towards tasks that revolve around aligning with human needs (e.g., brainstorming and email writing). This shift in task distribut…

    Submitted 7 December, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Fix typos in Table 1

  45. arXiv:2310.02638  [pdf, other]

    cs.CV

    P2CADNet: An End-to-End Reconstruction Network for Parametric 3D CAD Model from Point Clouds

    Authors: Zhihao Zong, Fazhi He, Rubin Fan, Yuxin Liu

    Abstract: Computer Aided Design (CAD), especially the feature-based parametric CAD, plays an important role in modern industry and society. However, the reconstruction of featured CAD model is more challenging than the reconstruction of other CAD models. To this end, this paper proposes an end-to-end network to reconstruct featured CAD model from point cloud (P2CADNet). Initially, the proposed P2CADNet arch…

    Submitted 4 October, 2023; originally announced October 2023.

  46. RoadFormer: Duplex Transformer for RGB-Normal Semantic Road Scene Parsing

    Authors: Jiahang Li, Yikang Zhang, Peng Yun, Guangliang Zhou, Qijun Chen, Rui Fan

    Abstract: The recent advancements in deep convolutional neural networks have shown significant promise in the domain of road scene parsing. Nevertheless, the existing works focus primarily on freespace detection, with little attention given to hazardous road defects that could compromise both driving safety and comfort. In this paper, we introduce RoadFormer, a novel Transformer-based data-fusion network de…

    Submitted 1 July, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: 10 pages, 7 figures. Accepted by Transactions on Intelligent Vehicles

  47. arXiv:2309.10314  [pdf, other]

    cs.RO cs.CV

    Dive Deeper into Rectifying Homography for Stereo Camera Online Self-Calibration

    Authors: Hongbo Zhao, Yikang Zhang, Qijun Chen, Rui Fan

    Abstract: Accurate estimation of stereo camera extrinsic parameters is the key to guarantee the performance of stereo matching algorithms. In prior arts, the online self-calibration of stereo cameras has commonly been formulated as a specialized visual odometry problem, without taking into account the principles of stereo rectification. In this paper, we first delve deeply into the concept of rectifying hom…

    Submitted 3 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

  48. arXiv:2308.16555  [pdf, other]

    cs.CV cs.RO

    E3CM: Epipolar-Constrained Cascade Correspondence Matching

    Authors: Chenbo Zhou, Shuai Su, Qijun Chen, Rui Fan

    Abstract: Accurate and robust correspondence matching is of utmost importance for various 3D computer vision tasks. However, traditional explicit programming-based methods often struggle to handle challenging scenarios, and deep learning-based methods require large well-labeled datasets for network training. In this article, we introduce Epipolar-Constrained Cascade Correspondence (E3CM), a novel approach t…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted to Neurocomputing

  49. arXiv:2308.15982  [pdf, other]

    cs.CL

    MerA: Merging Pretrained Adapters For Few-Shot Learning

    Authors: Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, Dacheng Tao

    Abstract: Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks. However, it often yields subpar results in few-shot learning. AdapterFusion, which assembles pretrained adapters using composition layers tailored to specific tasks, is a possible solution but significantly increases trainable parameters and deployment…

    Submitted 30 August, 2023; originally announced August 2023.

  50. arXiv:2308.11992  [pdf]

    q-bio.TO cs.AI

    Critical Evaluation of Artificial Intelligence as Digital Twin of Pathologist for Prostate Cancer Pathology

    Authors: Okyaz Eminaga, Mahmoud Abbas, Christian Kunder, Yuri Tolkach, Ryan Han, James D. Brooks, Rosalie Nolley, Axel Semjonow, Martin Boegemann, Robert West, ** Long, Richard Fan, Olaf Bettendorf

    Abstract: Prostate cancer pathology plays a crucial role in clinical management but is time-consuming. Artificial intelligence (AI) shows promise in detecting prostate cancer and grading patterns. We tested an AI-based digital twin of a pathologist, vPatho, on 2,603 histology images of prostate tissue stained with hematoxylin and eosin. We analyzed various factors influencing tumor-grade disagreement betwee…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: Under Review