Showing 1–50 of 473 results for author: Feng, Z

Searching in archive cs.
  1. arXiv:2406.19749  [pdf, other]

    eess.IV cs.CV

    SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation

    Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performances due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.19646  [pdf, other]

    cs.RO

    Time-optimal Flight in Cluttered Environments via Safe Reinforcement Learning

    Authors: Wei Xiao, Zhaohan Feng, Ziyu Zhou, Jian Sun, Gang Wang, Jie Chen

    Abstract: This paper addresses the problem of guiding a quadrotor through a predefined sequence of waypoints in cluttered environments, aiming to minimize the flight time while avoiding collisions. Previous approaches either suffer from prolonged computational time caused by solving complex non-convex optimization problems or are limited by the inherent smoothness of polynomial trajectory representations, t…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 7 pages, 3 figures

  3. arXiv:2406.18254  [pdf, other]

    cs.IR cs.AI cs.MM

    Improving the Consistency in Cross-Lingual Cross-Modal Retrieval with 1-to-K Contrastive Learning

    Authors: Zhijie Nie, Richong Zhang, Zhangchi Feng, Hailang Huang, Xudong Liu

    Abstract: Cross-lingual Cross-modal Retrieval (CCR) is an essential task in web search, which aims to break the barriers between modality and language simultaneously and achieves image-text retrieval in the multi-lingual scenario with a single model. In recent years, excellent progress has been made based on cross-lingual cross-modal pre-training; particularly, the methods based on contrastive learning on l…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024 Research Track

  4. arXiv:2406.17460  [pdf, other]

    cs.CV

    Investigating Self-Supervised Methods for Label-Efficient Learning

    Authors: Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais

    Abstract: Vision transformers combined with self-supervised learning have enabled the development of models which scale across large datasets for several downstream tasks like classification, segmentation and detection. The low-shot learning capability of these models, across several low-shot downstream tasks, has been largely under explored. We perform a system level study of different self supervised pret…

    Submitted 25 June, 2024; originally announced June 2024.

  5. arXiv:2406.17450  [pdf, other]

    cs.CV cs.AI

    Pseudo Labelling for Enhanced Masked Autoencoders

    Authors: Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais

    Abstract: Masked Image Modeling (MIM)-based models, such as SdAE, CAE, GreenMIM, and MixAE, have explored different strategies to enhance the performance of Masked Autoencoders (MAE) by modifying prediction, loss functions, or incorporating additional architectural components. In this paper, we propose an enhanced approach that boosts MAE performance by integrating pseudo labelling for both class and data t…

    Submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.15741  [pdf, other]

    cs.CL cs.AI cs.LG

    Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level

    Authors: Zhaopeng Feng, Ruizhe Chen, Yan Zhang, Zijie Meng, Zuozhu Liu

    Abstract: General-purpose Large Language Models (LLMs) like GPT-4 have achieved remarkable advancements in machine translation (MT) by leveraging extensive web content. On the other hand, translation-specific LLMs are built by pre-training on domain-specific monolingual corpora and fine-tuning with human-annotated translation data. Despite the superior performance, these methods either demand an unprecedent…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Our code is available at https://github.com/fzp0424/Ladder

  7. arXiv:2406.13149  [pdf, other]

    cs.CV

    High-Fidelity Facial Albedo Estimation via Texture Quantization

    Authors: Zimin Ran, Xingyu Ren, Xiang An, Kaicheng Yang, Xiangzi Dai, Ziyong Feng, Jia Guo, Linchao Zhu, Jiankang Deng

    Abstract: Recent 3D face reconstruction methods have made significant progress in shape estimation, but high-fidelity facial albedo reconstruction remains challenging. Existing methods depend on expensive light-stage captured data to learn facial albedo maps. However, a lack of diversity in subjects limits their ability to recover high-fidelity results. In this paper, we present a novel facial albedo recons…

    Submitted 18 June, 2024; originally announced June 2024.

  8. arXiv:2406.12315  [pdf, other]

    cs.AI

    PruningBench: A Comprehensive Benchmark of Structural Pruning

    Authors: Haoling Li, Changhao Li, Mengqi Xue, Gongfan Fang, Sheng Zhou, Zunlei Feng, Huiqiong Wang, Yong Wang, Lechao Cheng, Mingli Song, Jie Song

    Abstract: Structural pruning has emerged as a promising approach for producing more efficient models. Nevertheless, the community suffers from a lack of standardized benchmarks and metrics, leaving the progress in this area not fully comprehended. To fill this gap, we present the first comprehensive benchmark, termed PruningBench, for structural pruning. PruningBench showcases the following three c…

    Submitted 28 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Datasets and Benchmarks Track

  9. arXiv:2406.10500  [pdf, other]

    cs.LG cs.SI

    Geodesic Distance Between Graphs: A Spectral Metric for Assessing the Stability of Graph Neural Networks

    Authors: Soumen Sikder Shuvo, Ali Aghdaei, Zhuo Feng

    Abstract: This paper presents a spectral framework for assessing the generalization and stability of Graph Neural Networks (GNNs) by introducing a Graph Geodesic Distance (GGD) metric. For two different graphs with the same number of nodes, our framework leverages a spectral graph matching procedure to find node correspondence so that the geodesic distance between them can be subsequently computed by solvin…

    Submitted 15 June, 2024; originally announced June 2024.

  10. arXiv:2406.10297  [pdf, other]

    cs.CL cs.AI

    SememeLM: A Sememe Knowledge Enhanced Method for Long-tail Relation Representation

    Authors: Shuyi Li, Shaojuan Wu, Xiaowang Zhang, Zhiyong Feng

    Abstract: Recognizing relations between two words is a fundamental task with the broad applications. Different from extracting relations from text, it is difficult to identify relations among words without their contexts. Especially for long-tail relations, it becomes more difficult due to inadequate semantic features. Existing approaches based on language models (LMs) utilize rich knowledge of LMs to enhan…

    Submitted 13 June, 2024; originally announced June 2024.

  11. arXiv:2406.09688  [pdf, other]

    cs.CL

    FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation

    Authors: Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Kezhi Mao

    Abstract: Controllable text generation (CTG) seeks to craft texts adhering to specific attributes, traditionally employing learning-based techniques such as training, fine-tuning, or prefix-tuning with attribute-specific datasets. These approaches, while effective, demand extensive computational and data resources. In contrast, some proposed learning-free alternatives circumvent learning but often yield inf…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: ACL 2024

  12. arXiv:2406.09455  [pdf, other]

    cs.CV cs.AI cs.CL

    Pandora: Towards General World Model with Natural Language Actions and Video States

    Authors: Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

    Abstract: World models simulate future states of the world in response to different actions. They facilitate interactive content creation and provides a foundation for grounded, long-horizon reasoning. Current foundation models do not fully meet the capabilities of general world models: large language models (LLMs) are constrained by their reliance on language modality and their limited understanding of the…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Website: https://world-model.maitrix.org/

  13. arXiv:2406.09181  [pdf, other]

    cs.CV cs.AI

    A Large-scale Universal Evaluation Benchmark For Face Forgery Detection

    Authors: Yijun Bei, Hengrui Lou, Jinsong Geng, Erteng Liu, Lechao Cheng, Jie Song, Mingli Song, Zunlei Feng

    Abstract: With the rapid development of AI-generated content (AIGC) technology, the production of realistic fake facial images and videos that deceive human visual perception has become possible. Consequently, various face forgery detection techniques have been proposed to identify such fake facial content. However, evaluating the effectiveness and generalizability of these detection techniques remains a si…

    Submitted 13 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: This is a paper about constructing a large-scale universal evaluation benchmark for face forgery detection. The full text is 30 pages

  14. arXiv:2406.08829  [pdf, other]

    cs.CV cs.CR

    Improving Adversarial Robustness via Feature Pattern Consistency Constraint

    Authors: Jiacong Hu, Jingwen Ye, Zunlei Feng, Jiazhen Yang, Shunyu Liu, Xiaotian Yu, Lingxiang Jia, Mingli Song

    Abstract: Convolutional Neural Networks (CNNs) are well-known for their vulnerability to adversarial attacks, posing significant security concerns. In response to these threats, various defense methods have emerged to bolster the model's robustness. However, most existing methods either focus on learning from adversarial perturbations, leading to overfitting to the adversarial examples, or aim to eliminate…

    Submitted 13 June, 2024; originally announced June 2024.

  15. arXiv:2406.06973  [pdf, other]

    cs.CV

    RWKV-CLIP: A Robust Vision-Language Representation Learner

    Authors: Tiancheng Gu, Kaicheng Yang, Xiang An, Ziyong Feng, Dongnan Liu, Weidong Cai, Jiankang Deng

    Abstract: Contrastive Language-Image Pre-training (CLIP) has significantly improved performance in various vision-language tasks by expanding the dataset with image-text pairs obtained from websites. This paper further explores CLIP from the perspectives of data and model architecture. To address the prevalence of noisy data and enhance the quality of large-scale image-text data crawled from the internet, w…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 14 pages, 10 figures

  16. arXiv:2406.05187  [pdf, other]

    cs.GT cs.AI cs.HC cs.LG

    How to Strategize Human Content Creation in the Era of GenAI?

    Authors: Seyed A. Esmaeili, Kshipra Bhawalkar, Zhe Feng, Di Wang, Haifeng Xu

    Abstract: Generative AI (GenAI) will have significant impact on content creation platforms. In this paper, we study the dynamic competition between a GenAI and a human contributor. Unlike the human, the GenAI's content only improves when more contents are created by human over the time; however, GenAI has the advantage of generating content at a lower cost. We study the algorithmic problem in this dynamic c…

    Submitted 7 June, 2024; originally announced June 2024.

  17. arXiv:2405.20612  [pdf, other]

    cs.CL cs.AI

    UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation

    Authors: Hanzhang Zhou, Zijian Feng, Zixiao Zhu, Junlang Qian, Kezhi Mao

    Abstract: Large language models (LLMs) have demonstrated impressive capabilities in various tasks using the in-context learning (ICL) paradigm. However, their effectiveness is often compromised by inherent bias, leading to prompt brittleness, i.e., sensitivity to design settings such as example selection, order, and prompt formatting. Previous studies have addressed LLM bias through external adjustment of m…

    Submitted 30 May, 2024; originally announced May 2024.

  18. arXiv:2405.16078   

    cs.IT

    An Multi-resources Integration Empowered Task Offloading in Internet of Vehicles: From the Perspective of Wireless Interference

    Authors: Xiaowu Liu, Yun Wang, Kan Yu, Dianxia Chen, Dong Li, Qixun Zhang, Zhiyong Feng

    Abstract: The task offloading technology plays a vital role in the Internet of Vehicles (IoV), by satisfying the diversified demands of the vehicles, such as the energy consumption and processing latency of the computing task. Different from the previous works, on the one hand, they ignored the wireless interference of communications among vehicle-to-vehicle (V2V), as well as between vehicles and roadside u…

    Submitted 25 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: The paper has been rejected by IEEE Transactions on Communications, apart from the Reviewers' comments, we need reconsider that inaccuracies in the data or results were identified post-submission, necessitating a withdrawal for correction. In addition, considering the plausibility of the simulations, one or more of the authors requested the withdrawal of the manuscript

  19. arXiv:2405.16062  [pdf, other]

    cs.IT eess.SP

    Movable Antenna Empowered Physical Layer Security Without Eve's CSI: Joint Optimization of Beamforming and Antenna Positions

    Authors: Zhiyong Feng, Yujia Zhao, Kan Yu, Dong Li

    Abstract: Physical layer security (PLS) technology based on the fixed-position antenna (FPA) has attracted widespread attention. Due to the fixed feature of the antennas, current FPA-based PLS schemes cannot fully utilize the spatial degree of freedom, and thus a weaken secure gain in the desired/undesired direction may exist. Different from the concept of FPA, mobile antenna (MA) is a novel technology th…

    Submitted 25 May, 2024; originally announced May 2024.

  20. arXiv:2405.16060  [pdf, other]

    cs.IT

    Delay-Effective Task Offloading Technology in Internet of Vehicles: From the Perspective of the Vehicle Platooning

    Authors: Kan Yu, Fuze Zhu, Xiaowu Liu, Zhiyong Feng, Dong Li

    Abstract: The task offloading technology plays a crucial vital role in the Internet of Vehicle (IoV) with the demands of delay minimum, by jointly optimizing the heterogeneous computing resources supported by the vehicles, roadside units (RSUs), and macro base stations (MBSs). In previous works, on the one hand, they ignored the wireless interference among the exchange and sharing of the task data. On the o…

    Submitted 25 May, 2024; originally announced May 2024.

  21. arXiv:2405.15346  [pdf, other]

    cs.CL cs.AI cs.LG

    BiSup: Bidirectional Quantization Error Suppression for Large Language Models

    Authors: Minghui Zou, Ronghui Guo, Sai Zhang, Xiaowang Zhang, Zhiyong Feng

    Abstract: As the size and context length of Large Language Models (LLMs) grow, weight-activation quantization has emerged as a crucial technique for efficient deployment of LLMs. Compared to weight-only quantization, weight-activation quantization presents greater challenges due to the presence of outliers in activations. Existing methods have made significant progress by exploring mixed-precision quantizat…

    Submitted 24 May, 2024; originally announced May 2024.

  22. arXiv:2405.14342  [pdf, other]

    cs.CV

    RoGS: Large Scale Road Surface Reconstruction based on 2D Gaussian Splatting

    Authors: Zhiheng Feng, Wenhua Wu, Hesheng Wang

    Abstract: Road surface reconstruction plays a crucial role in autonomous driving, which can be used for road lane perception and autolabeling tasks. Recently, mesh-based road surface reconstruction algorithms show promising reconstruction results. However, these mesh-based methods suffer from slow speed and poor rendering quality. In contrast, the 3D Gaussian Splatting (3DGS) shows superior rendering speed…

    Submitted 23 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  23. arXiv:2405.12719  [pdf, other]

    cs.CR

    How to Train a Backdoor-Robust Model on a Poisoned Dataset without Auxiliary Data?

    Authors: Yuwen Pu, Jiahao Chen, Chunyi Zhou, Zhou Feng, Qingming Li, Chunqiang Hu, Shouling Ji

    Abstract: Backdoor attacks have attracted wide attention from academia and industry due to their great security threat to deep neural networks (DNN). Most of the existing methods propose to conduct backdoor attacks by poisoning the training dataset with different strategies, so it's critical to identify the poisoned samples and then train a clean model on the unreliable dataset in the context of defending b…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 13 pages, under review

  24. arXiv:2405.12706  [pdf, other]

    cs.IR

    Disentangled Representation with Cross Experts Covariance Loss for Multi-Domain Recommendation

    Authors: Zhutian Lin, Junwei Pan, Haibin Yu, Xi Xiao, Ximei Wang, Zhixiang Feng, Shifeng Wen, Shudong Huang, Lei Xiao, Jie Jiang

    Abstract: Multi-domain learning (MDL) has emerged as a prominent research area aimed at enhancing the quality of personalized services. The key challenge in MDL lies in striking a balance between learning commonalities across domains while preserving the distinct characteristics of each domain. However, this gives rise to a challenging dilemma. On one hand, a model needs to leverage domain-specific modules,…

    Submitted 21 May, 2024; originally announced May 2024.

  25. arXiv:2405.12652  [pdf, other]

    cs.NI eess.SP

    Edge Information Hub-Empowered 6G NTN: Latency-Oriented Resource Orchestration and Configuration

    Authors: Yueshan Lin, Wei Feng, Yunfei Chen, Ning Ge, Zhiyong Feng, Yue Gao

    Abstract: Quick response to disasters is crucial for saving lives and reducing loss. This requires low-latency uploading of situation information to the remote command center. Since terrestrial infrastructures are often damaged in disaster areas, non-terrestrial networks (NTNs) are preferable to provide network coverage, and mobile edge computing (MEC) could be integrated to improve the latency performance.…

    Submitted 21 May, 2024; originally announced May 2024.

  26. arXiv:2405.11891  [pdf, ps, other]

    cs.CL cs.AI

    Unveiling and Manipulating Prompt Influence in Large Language Models

    Authors: Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Junlang Qian, Kezhi Mao

    Abstract: Prompts play a crucial role in guiding the responses of Large Language Models (LLMs). However, the intricate role of individual tokens in prompts, known as input saliency, in shaping the responses remains largely underexplored. Existing saliency methods either misalign with LLM generation objectives or rely heavily on linearity assumptions, leading to potential inaccuracies. To address this, we pr…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: ICLR 2024

  27. arXiv:2405.10991  [pdf, other]

    cs.LG cs.AI stat.ME

    Relative Counterfactual Contrastive Learning for Mitigating Pretrained Stance Bias in Stance Detection

    Authors: Jiarui Zhang, Shaojuan Wu, Xiaowang Zhang, Zhiyong Feng

    Abstract: Stance detection classifies stance relations (namely, Favor, Against, or Neither) between comments and targets. Pretrained language models (PLMs) are widely used to mine the stance relation to improve the performance of stance detection through pretrained knowledge. However, PLMs also embed "bad" pretrained knowledge concerning stance into the extracted stance relation semantics, resulting in pr…

    Submitted 16 May, 2024; originally announced May 2024.

  28. arXiv:2405.10492  [pdf]

    cs.CL cs.LG

    Automatic News Generation and Fact-Checking System Based on Language Processing

    Authors: Xirui Peng, Qiming Xu, Zheng Feng, Haopeng Zhao, Lianghao Tan, Yan Zhou, Zecheng Zhang, Chenwei Gong, Yingqiao Zheng

    Abstract: This paper explores an automatic news generation and fact-checking system based on language processing, aimed at enhancing the efficiency and quality of news production while ensuring the authenticity and reliability of the news content. With the rapid development of Natural Language Processing (NLP) and deep learning technologies, automatic news generation systems are capable of extracting key in…

    Submitted 20 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    ACM Class: I.5; H.4

  29. arXiv:2405.07430  [pdf, other]

    cs.SE cs.CR

    Don't Chase Your Tail! Missing Key Aspects Augmentation in Textual Vulnerability Descriptions of Long-tail Software through Feature Inference

    Authors: Linyi Han, Shidong Pan, Zhenchang Xing, Jiamou Sun, Sofonias Yitagesu, Xiaowang Zhang, Zhiyong Feng

    Abstract: Augmenting missing key aspects in Textual Vulnerability Descriptions (TVDs) for software with a large user base (referred to as non-long-tail software) has greatly advanced vulnerability analysis and software security research. However, these methods often overlook software instances that have a limited user base (referred to as long-tail software) due to limited TVDs, variations in software featu…

    Submitted 12 May, 2024; originally announced May 2024.

  30. arXiv:2405.04235  [pdf, other]

    cs.RO cs.LG

    LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning

    Authors: Zeyu Feng, Hao Luan, Pranav Goyal, Harold Soh

    Abstract: Operating effectively in complex environments while complying with specified constraints is crucial for the safe and successful deployment of robots that interact with and operate around people. In this work, we focus on generating long-horizon trajectories that adhere to novel static and temporally-extended constraints/instructions at test time. We propose a data-driven diffusion-based framework,…

    Submitted 7 May, 2024; originally announced May 2024.

  31. arXiv:2405.02906  [pdf, other]

    cs.CV

    SalFAU-Net: Saliency Fusion Attention U-Net for Salient Object Detection

    Authors: Kassaw Abraham Mulat, Zhengyong Feng, Tegegne Solomon Eshetie, Ahmed Endris Hasen

    Abstract: Salient object detection (SOD) remains an important task in computer vision, with applications ranging from image segmentation to autonomous driving. Fully convolutional network (FCN)-based methods have made remarkable progress in visual saliency detection over the last few decades. However, these methods have limitations in accurately detecting salient objects, particularly in challenging scenes…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 9 pages, 5 figures

  32. arXiv:2405.00476  [pdf, other]

    cs.LG

    A Comprehensive Survey of Dynamic Graph Neural Networks: Models, Frameworks, Benchmarks, Experiments and Challenges

    Authors: ZhengZhao Feng, Rui Wang, TianXing Wang, Mingli Song, Sai Wu, Shuibing He

    Abstract: Dynamic Graph Neural Networks (GNNs) combine temporal information with GNNs to capture structural, temporal, and contextual relationships in dynamic graphs simultaneously, leading to enhanced performance in various applications. As the demand for dynamic GNNs continues to grow, numerous models and frameworks have emerged to cater to different application needs. There is a pressing need for a compr…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Under review of PVLDB2025

  33. arXiv:2405.00168  [pdf, other]

    cs.CV

    Revisiting RGBT Tracking Benchmarks from the Perspective of Modality Validity: A New Benchmark, Problem, and Method

    Authors: Zhangyong Tang, Tianyang Xu, Zhenhua Feng, Xuefeng Zhu, He Wang, Pengcheng Shao, Chunyang Cheng, Xiao-Jun Wu, Muhammad Awais, Sara Atito, Josef Kittler

    Abstract: RGBT tracking draws increasing attention due to its robustness in multi-modality warranting (MMW) scenarios, such as nighttime and bad weather, where relying on a single sensing modality fails to ensure stable tracking results. However, the existing benchmarks predominantly consist of videos collected in common scenarios where both RGB and thermal infrared (TIR) information are of sufficient quali…

    Submitted 30 April, 2024; originally announced May 2024.

  34. arXiv:2404.19164  [pdf, ps, other]

    cs.CG

    Optimal Bridge, Twin Bridges and Beyond: Inserting Edges into a Road Network to Minimize the Constrained Diameters

    Authors: Zhidan Feng, Henning Fernau, Binhai Zhu

    Abstract: Given a road network modelled as a planar straight-line graph $G=(V,E)$ with $|V|=n$, let $(u,v)\in V\times V$, the shortest path (distance) between $u,v$ is denoted as $δ_G(u,v)$. Let $δ(G)=\max_{(u,v)}δ_G(u,v)$, for $(u,v)\in V\times V$, which is called the diameter of $G$. Given a disconnected road network modelled as two disjoint trees $T_1$ and $T_2$, this paper first aims at inserting one an…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures

    MSC Class: 68 ACM Class: F.2.2

  35. arXiv:2404.17617  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Beyond Traditional Threats: A Persistent Backdoor Attack on Federated Learning

    Authors: Tao Liu, Yuhang Zhang, Zhu Feng, Zhiqin Yang, Chen Xu, Dapeng Man, Wu Yang

    Abstract: Backdoors on federated learning will be diluted by subsequent benign updates. This is reflected in the significant reduction of attack success rate as iterations increase, ultimately failing. We use a new metric to quantify the degree of this weakened backdoor effect, called attack persistence. Given that research to improve this performance has not been widely noted, we propose a Full Combination…

    Submitted 26 April, 2024; originally announced April 2024.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(19): 21359-21367

  36. arXiv:2404.17462  [pdf, other]

    cs.NI

    Integrated Sensing and Communication Channel Modeling: A Survey

    Authors: Zhiqing Wei, Jinzhu Jia, Yangyang Niu, Lin Wang, Huici Wu, Heng Yang, Zhiyong Feng

    Abstract: Integrated sensing and communication (ISAC) is expected to play a crucial role in the sixth-generation (6G) mobile communication systems, offering potential applications in the scenarios of intelligent transportation, smart factories, etc. The performance of radar sensing in ISAC systems is closely related to the characteristics of radar sensing and communication channels. Therefore, ISAC channel…

    Submitted 26 April, 2024; originally announced April 2024.

  37. arXiv:2404.16275  [pdf]

    cs.NI cs.IT eess.SP

    Spectrum Sharing Policy in the Asia-Pacific Region

    Authors: Zhiyong Feng, Zhiqing Wei

    Abstract: In this chapter, we investigate the spectrum measurement results in Asia-Pacific region. Then the spectrum sharing policy in the Asia-Pacific region is reviewed in details, where the national projects and strategies on spectrum refarming and spectrum sharing in China, Japan, Singapore, India, Korea and Australia are investigated. Then we introduce the spectrum sharing test-bed that is developed in…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 33 pages, 17 figures

  38. arXiv:2404.15146  [pdf, other]

    cs.LG cs.CL

    Rethinking LLM Memorization through the Lens of Adversarial Compression

    Authors: Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

    Abstract: Large language models (LLMs) trained on web-scale datasets raise substantial concerns regarding permissible data usage. One major question is whether these models "memorize" all their training data or they integrate many data sources in some way more akin to how a human would learn and synthesize information. The answer hinges, to a large degree, on how we define memorization. In this work, we pro…

    Submitted 1 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: https://locuslab.github.io/acr-memorization

  39. arXiv:2404.14032  [pdf, other]

    cs.CV

    1st Place Solution to the 1st SkatingVerse Challenge

    Authors: Tao Sun, Yuanzi Fu, Kaicheng Yang, Jian Wu, Ziyong Feng

    Abstract: This paper presents the winning solution for the 1st SkatingVerse Challenge. We propose a method that involves several steps. To begin, we leverage the DINO framework to extract the Region of Interest (ROI) and perform precise cropping of the raw video footage. Subsequently, we employ three distinct models, namely Unmasked Teacher, UniformerV2, and InfoGCN, to capture different aspects of the data…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 3 pages, 1st SkatingVerse Challenge, 18th IEEE International Conference on Automatic Face and Gesture Recognition workshop

  40. arXiv:2404.12216  [pdf, other]

    cs.CV

    ProTA: Probabilistic Token Aggregation for Text-Video Retrieval

    Authors: Han Fang, Xianghao Zang, Chao Ban, Zerun Feng, Lanxiang Zhou, Zhongjiang He, Yongxiang Li, Hao Sun

    Abstract: Text-video retrieval aims to find the most relevant cross-modal samples for a given query. Recent methods focus on modeling the whole spatial-temporal relations. However, since video clips contain more diverse content than captions, the model aligning these asymmetric video-text pairs has a high risk of retrieving many false positive results. In this paper, we propose Probabilistic Token Aggregati…

    Submitted 20 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  41. arXiv:2404.11317  [pdf, other]

    cs.CV cs.AI

    Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives

    Authors: Zhangchi Feng, Richong Zhang, Zhijie Nie

    Abstract: The Composed Image Retrieval (CIR) task aims to retrieve target images using a composed query consisting of a reference image and a modified text. Advanced methods often utilize contrastive learning as the optimization objective, which benefits from adequate positive and negative examples. However, the triplet for CIR incurs high manual annotation costs, resulting in limited positive examples. Fur…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 12 pages, 11 figures

  42. arXiv:2404.09476  [pdf, other]

    cs.CV

    FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining

    Authors: Zou Zhen, Yu Hu, Zhao Feng

    Abstract: Images corrupted by rain streaks often lose vital frequency information for perception, and image deraining aims to solve this issue which relies on global and local degradation modeling. Recent studies have witnessed the effectiveness and efficiency of Mamba for perceiving global and local information based on its exploiting local correlation among patches, however, rarely attempts have been expl…

    Submitted 15 April, 2024; originally announced April 2024.

  43. arXiv:2404.09387  [pdf, other]

    cs.CV cs.AI cs.LG

    RankCLIP: Ranking-Consistent Language-Image Pretraining

    Authors: Yiming Zhang, Zhuokai Zhao, Zhaorun Chen, Zhili Feng, Zenghui Ding, Yining Sun

    Abstract: Self-supervised contrastive learning models, such as CLIP, have set new benchmarks for vision-language models in many downstream tasks. However, their dependency on rigid one-to-one mappings overlooks the complex and often multifaceted relationships between and within texts and images. To this end, we introduce RANKCLIP, a novel pretraining method that extends beyond the rigid one-to-one matching…

    Submitted 20 June, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures, 6 tables. Code and model checkpoints are available at https://github.com/Jam1ezhang/RankCLIP

  44. arXiv:2404.08126  [pdf, other]

    cs.GT cs.AI

    Auctions with LLM Summaries

    Authors: Kumar Avinava Dubey, Zhe Feng, Rahul Kidambi, Aranyak Mehta, Di Wang

    Abstract: We study an auction setting in which bidders bid for placement of their content within a summary generated by a large language model (LLM), e.g., an ad auction in which the display is a summary paragraph of multiple ads. This generalizes the classic ad settings such as position auctions to an LLM generated setting, which allows us to handle general display formats. We propose a novel factorized fr…

    Submitted 11 April, 2024; originally announced April 2024.

  45. arXiv:2404.05249  [pdf, other]

    cs.RO cs.LG eess.SY

    SAFE-GIL: SAFEty Guided Imitation Learning

    Authors: Yusuf Umut Ciftci, Zeyuan Feng, Somil Bansal

    Abstract: Behavior Cloning is a popular approach to Imitation Learning, in which a robot observes an expert supervisor and learns a control policy. However, behavior cloning suffers from the "compounding error" problem - the policy errors compound as it deviates from the expert demonstrations and might lead to catastrophic system failures, limiting its use in safety-critical applications. On-policy data agg…

    Submitted 8 April, 2024; originally announced April 2024.

  46. arXiv:2404.00814  [pdf, other]

    cs.RO eess.SY

    Imposing Exact Safety Specifications in Neural Reachable Tubes

    Authors: Aditya Singh, Zeyuan Feng, Somil Bansal

    Abstract: Hamilton-Jacobi (HJ) reachability analysis is a verification tool that provides safety and performance guarantees for autonomous systems. It is widely adopted because of its ability to handle nonlinear dynamical systems with bounded adversarial disturbances and constraints on states and inputs. However, it involves solving a PDE to compute a safety value function, whose computational and memory co…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Submitted to 63rd IEEE Conference on Decision and Control

  47. arXiv:2404.00509  [pdf, other]

    cs.LG cs.CV

    DailyMAE: Towards Pretraining Masked Autoencoders in One Day

    Authors: Jiantao Wu, Shentong Mo, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais

    Abstract: Recently, masked image modeling (MIM), an important self-supervised learning (SSL) method, has drawn attention for its effectiveness in learning data representation from unlabeled data. Numerous studies underscore the advantages of MIM, highlighting how models pretrained on extensive datasets can enhance the performance of downstream tasks. However, the high computational demands of pretraining po…

    Submitted 30 March, 2024; originally announced April 2024.

  48. arXiv:2403.19902  [pdf, other]

    cs.CV

    Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification

    Authors: Jianfeng Cai, Yue Ma, Zhixi Feng, Shuyuan Yang

    Abstract: Polarimetric synthetic aperture radar (PolSAR) image interpretation is widely used in various fields. Recently, deep learning has made significant progress in PolSAR image classification. Supervised learning (SL) requires a large amount of labeled PolSAR data with high quality to achieve better performance, however, manually labeled data is insufficient. This causes the SL to fail into overfitting…

    Submitted 3 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  49. arXiv:2403.19322  [pdf, other]

    cs.CV cs.CL

    Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models

    Authors: Jiaxing Chen, Yuxuan Liu, Dehu Li, Xiang An, Weimo Deng, Ziyong Feng, Yongle Zhao, Yin Xie

    Abstract: The rise of Multimodal Large Language Models (MLLMs), renowned for their advanced instruction-following and reasoning capabilities, has significantly propelled the field of visual reasoning. However, due to limitations in their image tokenization processes, most MLLMs struggle to capture fine details of text and objects in images, especially in high-resolution samples. To overcome this limitation,…

    Submitted 18 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: 15 pages, 8 figures

  50. arXiv:2403.18274  [pdf, other]

    cs.CV

    DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

    Authors: Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang

    Abstract: Information inside visual and LiDAR data is well complementary derived from the fine-grained texture of images and massive geometric information in point clouds. However, it remains challenging to explore effective visual-LiDAR fusion, mainly due to the intrinsic data structure inconsistency between two modalities: Images are regular and dense, but LiDAR points are unordered and sparse. To address…

    Submitted 27 March, 2024; originally announced March 2024.