Showing 1–50 of 182 results for author: Luo, B

Searching in archive cs.
  1. arXiv:2406.19859  [pdf, other]

    cs.AI cs.HC cs.MM

    MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Qi He, Wangmeng Xiang, Hanyuan Chen, Jin-Peng Lan, Xianhui Lin, Kang Zhu, Bin Luo, Yifeng Geng, Xuansong Xie, Alexander G. Hauptmann

    Abstract: MetaDesigner revolutionizes artistic typography synthesis by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement. At the core of this framework lies a multi-agent system comprising the Pipeline, Glyph, and Texture agents, which collectively enable the creation of customized WordArt, ranging from semantic enhancements to the imposition…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 18 pages, 16 figures, Project: https://modelscope.cn/studios/WordArt/WordArt

  2. arXiv:2406.14846  [pdf, other]

    cs.LG

    Graph Edge Representation via Tensor Product Graph Convolutional Representation

    Authors: Bo Jiang, Sheng Ge, Ziyan Zhang, Beibei Wang, Jin Tang, Bin Luo

    Abstract: Graph Convolutional Networks (GCNs) have been widely studied. The core of GCNs is the definition of convolution operators on graphs. However, existing Graph Convolution (GC) operators are mainly defined on adjacency matrix and node features and generally focus on obtaining effective node embeddings which cannot be utilized to address the graphs with (high-dimensional) edge features. To address thi…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.12784  [pdf, other]

    cs.CL

    UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions

    Authors: Xunzhi Wang, Zhuowei Zhang, Qiongyu Li, Gaonan Chen, Mengting Hu, Zhiyu li, Bitong Luo, Hang Gao, Zhixin Han, Haotian Wang

    Abstract: The rapid development of large language models (LLMs) has shown promising practical results. However, their low interpretability often leads to errors in unforeseen circumstances, limiting their utility. Many works have focused on creating comprehensive evaluation systems, but previous benchmarks have primarily assessed problem-solving abilities while neglecting the response's uncertainty, which m…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Under review

  4. arXiv:2406.04594  [pdf, other]

    cs.DC cs.AI cs.LG

    Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

    Authors: Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu

    Abstract: The emergence of Large Language Models (LLMs) has necessitated the adoption of parallel training techniques, involving the deployment of thousands of GPUs to train a single model. Unfortunately, we have found that the efficiency of current parallel training is often suboptimal, largely due to the following two main issues. Firstly, hardware failures are inevitable, leading to interruptions in the…

    Submitted 6 June, 2024; originally announced June 2024.

  5. arXiv:2406.01127  [pdf, other]

    cs.CV

    Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection

    Authors: Kunpeng Wang, Zhengzheng Tu, Chenglong Li, Cheng Zhang, Bin Luo

    Abstract: Multi-modal salient object detection (MSOD) aims to boost saliency detection performance by integrating visible sources with depth or thermal infrared ones. Existing methods generally design different fusion schemes to handle certain issues or challenges. Although these fusion schemes are effective at addressing specific issues or challenges, they may struggle to handle multiple complex challenges…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by TCSVT 2024

  6. arXiv:2406.00917  [pdf, other]

    cs.CV

    Alignment-Free RGBT Salient Object Detection: Semantics-guided Asymmetric Correlation Network and A Unified Benchmark

    Authors: Kunpeng Wang, Danying Lin, Chenglong Li, Zhengzheng Tu, Bin Luo

    Abstract: RGB and Thermal (RGBT) Salient Object Detection (SOD) aims to achieve high-quality saliency prediction by exploiting the complementary information of visible and thermal image pairs, which are initially captured in an unaligned manner. However, existing methods are tailored for manually aligned image pairs, which are labor-intensive, and directly applying these methods to original unaligned image…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Accepted by TMM 2024

  7. arXiv:2405.19688  [pdf, other]

    cs.CV

    DNPM: A Neural Parametric Model for the Synthesis of Facial Geometric Details

    Authors: Haitao Cao, Baoping Cheng, Qiran Pu, Haocheng Zhang, Bin Luo, Yixiang Zhuang, Juncong Lin, Liyan Chen, Xuan Cheng

    Abstract: Parametric 3D models have enabled a wide variety of computer vision and graphics tasks, such as modeling human faces, bodies and hands. In 3D face modeling, 3DMM is the most widely used parametric model, but can't generate fine geometric details solely from identity and expression inputs. To tackle this limitation, we propose a neural parametric model named DNPM for the facial geometric details, w…

    Submitted 13 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  8. arXiv:2405.18078  [pdf, other]

    cs.CV

    Edge-guided and Class-balanced Active Learning for Semantic Segmentation of Aerial Images

    Authors: Lianlei Shan, Weiqiang Wang, Ke Lv, Bin Luo

    Abstract: Semantic segmentation requires pixel-level annotation, which is time-consuming. Active Learning (AL) is a promising method for reducing data annotation costs. Due to the gap between aerial and natural images, the previous AL methods are not ideal, mainly caused by unreasonable labeling units and the neglect of class imbalance. Previous labeling units are based on images or regions, which does not…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  9. arXiv:2405.18044  [pdf, other]

    cs.MA cs.AI

    Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation

    Authors: Jiaqi Shao, Tianjun Yuan, Tao Lin, Xuanyu Cao, Bing Luo

    Abstract: Cognitive abilities, such as Theory of Mind (ToM), play a vital role in facilitating cooperation in human social interactions. However, our study reveals that agents with higher ToM abilities may not necessarily exhibit better cooperative behavior compared to those with lower ToM abilities. To address this challenge, we propose a novel matching coalition mechanism that leverages the strengths of a…

    Submitted 28 May, 2024; originally announced May 2024.

  10. arXiv:2405.04867  [pdf, other]

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  11. arXiv:2405.02717  [pdf, other]

    cs.CV

    AFter: Attention-based Fusion Router for RGBT Tracking

    Authors: Andong Lu, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo

    Abstract: Multi-modal feature fusion as a core investigative component of RGBT tracking emerges numerous fusion studies in recent years. However, existing RGBT tracking methods widely adopt fixed fusion structures to integrate multi-modal feature, which are hard to handle various challenges in dynamic scenarios. To address this problem, this work presents a novel \emph{A}ttention-based \emph{F}usion rou\emp…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Peer review

  12. arXiv:2404.14581  [pdf, other]

    cs.CV cs.AI cs.CR

    The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking

    Authors: Yuying Li, Zeyan Liu, Junyi Zhao, Liangqin Ren, Fengjun Li, Jiebo Luo, Bo Luo

    Abstract: Generative AI models can produce high-quality images based on text prompts. The generated images often appear indistinguishable from images generated by conventional optical photography devices or created by human artists (i.e., real images). While the outstanding performance of such generative models is generally well received, security concerns arise. For instance, such image generators could be…

    Submitted 22 April, 2024; originally announced April 2024.

  13. arXiv:2404.13804  [pdf, other]

    cs.DC cs.LG cs.NI eess.SY

    Adaptive Heterogeneous Client Sampling for Federated Learning over Wireless Networks

    Authors: Bing Luo, Wenli Xiao, Shiqiang Wang, Jianwei Huang, Leandros Tassiulas

    Abstract: Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which suffers from slow wall-clock time for convergence due to high deg…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Published in IEEE Transactions on Mobile Computing (TMC). arXiv admin note: substantial text overlap with arXiv:2112.11256

  14. arXiv:2404.06041  [pdf, ps, other]

    cs.SE

    On Evaluating the Efficiency of Source Code Generated by LLMs

    Authors: Changan Niu, Ting Zhang, Chuanyi Li, Bin Luo, Vincent Ng

    Abstract: Recent years have seen the remarkable capabilities of large language models (LLMs) for code generation. Different from existing work that evaluate the correctness of the code generated by LLMs, we propose to further evaluate its efficiency. More efficient code can lead to higher performance and execution efficiency of programs and software completed by LLM-assisted programming. First, we evaluate…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 1st special event of AI Foundation Models and Software Engineering (FORGE 2024)

  15. arXiv:2403.17651  [pdf, other]

    cs.CV

    Exploring Dynamic Transformer for Efficient Object Tracking

    Authors: Jiawen Zhu, Xin Chen, Haiwen Diao, Shuai Li, Jun-Yan He, Chenyang Li, Bin Luo, Dong Wang, Huchuan Lu

    Abstract: The speed-precision trade-off is a critical problem for visual object tracking which usually requires low latency and deployment on constrained resources. Existing solutions for efficient tracking mainly focus on adopting light-weight backbones or modules, which nevertheless come at the cost of a sacrifice in precision. In this paper, inspired by dynamic network routing, we propose DyTrack, a dyna…

    Submitted 26 March, 2024; originally announced March 2024.

  16. arXiv:2403.17460  [pdf, other]

    eess.IV cs.CV

    Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

    Authors: Runmin Dong, Shuai Yuan, Bin Luo, Mengxuan Chen, Jinxiao Zhang, Lixian Zhang, Weijia Li, Juepeng Zheng, Haohuan Fu

    Abstract: Reference-based super-resolution (RefSR) has the potential to build bridges across spatial and temporal resolutions of remote sensing images. However, existing RefSR methods are limited by the faithfulness of content reconstruction and the effectiveness of texture transfer in large scaling factors. Conditional diffusion models have opened up new opportunities for generating realistic high-resoluti…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  17. The Power of Bamboo: On the Post-Compromise Security for Searchable Symmetric Encryption

    Authors: Tianyang Chen, Peng Xu, Stjepan Picek, Bo Luo, Willy Susilo, Hai Jin, Kaitai Liang

    Abstract: Dynamic searchable symmetric encryption (DSSE) enables users to delegate the keyword search over dynamically updated encrypted databases to an honest-but-curious server without losing keyword privacy. This paper studies a new and practical security risk to DSSE, namely, secret key compromise (e.g., a user's secret key is leaked or stolen), which threatens all the security guarantees offered by exi…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: This is a full version paper that includes the security proof. The paper with the same name has been published by NDSS 2023

    Journal ref: NDSS 2023

  18. arXiv:2403.06454  [pdf, other]

    cs.CE

    When Crypto Economics Meet Graph Analytics and Learning

    Authors: Bingqiao Luo

    Abstract: Utilizing graph analytics and learning has proven to be an effective method for exploring aspects of crypto economics such as network effects, decentralization, tokenomics, and fraud detection. However, the majority of existing research predominantly focuses on leading cryptocurrencies, namely Bitcoin (BTC) and Ethereum (ETH), overlooking the vast diversity among the more than 10,000 cryptocurrenc…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 4 pages, 2 figures

  19. arXiv:2403.05839  [pdf, other]

    cs.CV cs.AI cs.NE

    Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline

    Authors: Xiao Wang, Ju Huang, Shiao Wang, Chuanming Tang, Bo Jiang, Yonghong Tian, Jin Tang, Bin Luo

    Abstract: Current event-/frame-event based trackers undergo evaluation on short-term tracking datasets, however, the tracking of real-world scenarios involves long-term tracking, and the performance of existing tracking algorithms in these scenarios remains unclear. In this paper, we first propose a new long-term and large-scale frame-event single object tracking dataset, termed FELT. It contains 742 videos…

    Submitted 3 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: In Peer Review

  20. arXiv:2403.02969  [pdf, other]

    cs.CV

    Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception

    Authors: Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie

    Abstract: Multimodal Large Language Model (MLLMs) leverages Large Language Models as a cognitive framework for diverse visual-language tasks. Recent efforts have been made to equip MLLMs with visual perceiving and grounding capabilities. However, there still remains a gap in providing fine-grained pixel-level perceptions and extending interactions beyond text-specific inputs. In this work, we propose {\bf{A…

    Submitted 25 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  21. arXiv:2403.01210  [pdf, other]

    cs.CV cs.AI

    SAR-AE-SFP: SAR Imagery Adversarial Example in Real Physics domain with Target Scattering Feature Parameters

    Authors: Jiahao Cui, Jiale Duan, Binyan Luo, Hang Cao, Wang Guo, Haifeng Li

    Abstract: Deep neural network-based Synthetic Aperture Radar (SAR) target recognition models are susceptible to adversarial examples. Current adversarial example generation methods for SAR imagery primarily operate in the 2D digital domain, known as image adversarial examples. Recent work, while considering SAR imaging scatter mechanisms, fails to account for the actual imaging process, rendering attacks in…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 10 pages, 9 figures, 2 tables

  22. arXiv:2403.01182  [pdf, other]

    cs.CR

    d-DSE: Distinct Dynamic Searchable Encryption Resisting Volume Leakage in Encrypted Databases

    Authors: Dongli Liu, Wei Wang, Peng Xu, Laurence T. Yang, Bo Luo, Kaitai Liang

    Abstract: Dynamic Searchable Encryption (DSE) has emerged as a solution to efficiently handle and protect large-scale data storage in encrypted databases (EDBs). Volume leakage poses a significant threat, as it enables adversaries to reconstruct search queries and potentially compromise the security and privacy of data. Padding strategies are common countermeasures for the leakage, but they significantly in…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 23 pages, 13 figures, will be published in USENIX Security'24

  23. arXiv:2402.10464  [pdf, other]

    cs.LG cs.NI

    FedKit: Enabling Cross-Platform Federated Learning for Android and iOS

    Authors: Sichang He, Beilong Tang, Boyan Zhang, Jiaoqi Shao, Xiaomin Ouyang, Daniel Nata Nugraha, Bing Luo

    Abstract: We present FedKit, a federated learning (FL) system tailored for cross-platform FL research on Android and iOS devices. FedKit pipelines cross-platform FL development by enabling model conversion, hardware-accelerated training, and cross-platform model aggregation. Our FL workflow supports flexible machine learning operations (MLOps) in production, facilitating continuous model delivery and traini…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: This work has been accepted for demonstration on IEEE International Conference on Computer Communications (INFOCOM) 2024

  24. arXiv:2402.10097  [pdf, other]

    cs.LG cs.NI

    Adaptive Federated Learning in Heterogeneous Wireless Networks with Independent Sampling

    Authors: Jiaxiang Geng, Yanzhao Hou, Xiaofeng Tao, Juncheng Wang, Bing Luo

    Abstract: Federated Learning (FL) algorithms commonly sample a random subset of clients to address the straggler issue and improve communication efficiency. While recent works have proposed various client sampling methods, they have limitations in joint system and data heterogeneity design, which may not align with practical heterogeneous wireless networks. In this work, we advocate a new independent client…

    Submitted 13 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 6 pages, 5 figures, accepted for publication in IEEE International Conference on Communications (ICC)

  25. arXiv:2402.01276  [pdf, other]

    cs.AI

    Federated Unlearning: a Perspective of Stability and Fairness

    Authors: Jiaqi Shao, Tao Lin, Xuanyu Cao, Bing Luo

    Abstract: This paper explores the multifaceted consequences of federated unlearning (FU) with data heterogeneity. We introduce key metrics for FU assessment, concentrating on verification, global stability, and local fairness, and investigate the inherent trade-offs. Furthermore, we formulate the unlearning process with data heterogeneity through an optimization framework. Our key contribution lies in a com…

    Submitted 1 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  26. arXiv:2401.17916  [pdf, other]

    cs.CV

    Source-free Domain Adaptive Object Detection in Remote Sensing Images

    Authors: Weixing Liu, Jun Liu, Xin Su, Han Nie, Bin Luo

    Abstract: Recent studies have used unsupervised domain adaptive object detection (UDAOD) methods to bridge the domain gap in remote sensing (RS) images. However, UDAOD methods typically assume that the source domain data can be accessed during the domain adaptation process. This setting is often impractical in the real world due to RS data privacy and transmission difficulty. To address this challenge, we p…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 14 pages, 11 figures

  27. arXiv:2401.11459  [pdf, other]

    cs.AR cs.AI cs.LG

    AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology

    Authors: Rongqing Cong, Wenyang He, Mingxuan Li, Bangning Luo, Zebin Yang, Yuchao Yang, Ru Huang, Bonan Yan

    Abstract: Large language models (LLMs) with Transformer architectures have become phenomenal in natural language processing, multimodal generative artificial intelligence, and agent-oriented artificial intelligence. The self-attention module is the most dominating sub-structure inside Transformer-based LLMs. Computation using general-purpose graphics processing units (GPUs) inflicts reckless demand for I/O…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: for associated source codes, see https://bonany.cc/attentionleg

  28. arXiv:2401.03638  [pdf, other]

    cs.LG cs.CV

    Unifying Graph Contrastive Learning via Graph Message Augmentation

    Authors: Ziyan Zhang, Bo Jiang, Jin Tang, Bin Luo

    Abstract: Graph contrastive learning is usually performed by first conducting Graph Data Augmentation (GDA) and then employing a contrastive learning pipeline to train GNNs. As we know that GDA is an important issue for graph contrastive learning. Various GDAs have been developed recently which mainly involve dropping or perturbing edges, nodes, node attributes and edge attributes. However, to our knowledge…

    Submitted 7 January, 2024; originally announced January 2024.

  29. arXiv:2401.03197  [pdf, other]

    cs.AI cs.LG

    Decision Making in Non-Stationary Environments with Policy-Augmented Search

    Authors: Ava Pettet, Yunuo Zhang, Baiting Luo, Kyle Wray, Hendrik Baier, Aron Laszka, Abhishek Dubey, Ayan Mukhopadhyay

    Abstract: Sequential decision-making under uncertainty is present in many important problems. Two popular approaches for tackling such problems are reinforcement learning and online search (e.g., Monte Carlo tree search). While the former learns a policy by interacting with the environment (typically done before execution), the latter uses a generative model of the environment to sample promising action tra…

    Submitted 20 January, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

    Comments: Extended Abstract accepted for presentation at AAMAS 2024

  30. arXiv:2401.01841  [pdf, other]

    cs.AI cs.LG

    Act as You Learn: Adaptive Decision-Making in Non-Stationary Markov Decision Processes

    Authors: Baiting Luo, Yunuo Zhang, Abhishek Dubey, Ayan Mukhopadhyay

    Abstract: A fundamental (and largely open) challenge in sequential decision-making is dealing with non-stationary environments, where exogenous environmental conditions change over time. Such problems are traditionally modeled as non-stationary Markov decision processes (NSMDP). However, existing approaches for decision-making in NSMDPs have two major shortcomings: first, they assume that the updated enviro…

    Submitted 21 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Accepted for publication at the International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), 2024

  31. arXiv:2401.01699  [pdf, other]

    cs.CV cs.CL cs.MM

    WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Yusen Hu, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou

    Abstract: This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope. We address the challenge of simplifying artistic typography for non-professionals by offering a dynamic, adaptive, and computationally efficient alternative to traditional rigid templates. Our approach leverages the power of LLMs to u…

    Submitted 12 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Spotlight Paper at the Workshop on Machine Learning for Creativity and Design, 37th Conference on Neural Information Processing Systems (NeurIPS 2023). 5 pages, 5 figures

  32. arXiv:2401.01674  [pdf, other]

    cs.CV

    Transformer RGBT Tracking with Spatio-Temporal Multimodal Tokens

    Authors: Dengdi Sun, Yajie Pan, Andong Lu, Chenglong Li, Bin Luo

    Abstract: Many RGBT tracking researches primarily focus on modal fusion design, while overlooking the effective handling of target appearance changes. While some approaches have introduced historical frames or fuse and replace initial templates to incorporate temporal information, they have the risk of disrupting the original target appearance and accumulating errors over time. To alleviate these limitation…

    Submitted 3 January, 2024; originally announced January 2024.

  33. arXiv:2312.17448  [pdf, other]

    cs.CV

    Tracking with Human-Intent Reasoning

    Authors: Jiawen Zhu, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Huchuan Lu, Yifeng Geng, Xuansong Xie

    Abstract: Advances in perception modeling have significantly improved the performance of object tracking. However, the current methods for specifying the target object in the initial frame are either by 1) using a box or mask template, or by 2) providing an explicit language description. These manners are cumbersome and do not allow the tracker to have self-reasoning ability. Therefore, this work proposes a…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 8 pages, 4 figures

  34. arXiv:2312.16246  [pdf, other]

    cs.CV

    Nighttime Person Re-Identification via Collaborative Enhancement Network with Multi-domain Learning

    Authors: Andong Lu, Tianrui Zha, Chenglong Li, Jin Tang, Xiaofeng Wang, Bin Luo

    Abstract: Prevalent nighttime ReID methods typically combine relighting networks and ReID networks in a sequential manner, which not only restricts the ReID performance by the quality of relighting images, but also neglects the effective collaborative modeling between image relighting and person ReID tasks. To handle these problems, we propose a novel Collaborative Enhancement Network called CENet, which pe…

    Submitted 25 December, 2023; originally announced December 2023.

  35. arXiv:2312.16244  [pdf, other]

    cs.CV

    Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks

    Authors: Andong Lu, Jiacong Zhao, Chenglong Li, Jin Tang, Bin Luo

    Abstract: Current RGBT tracking research relies on the complete multi-modal input, but modal information might miss due to some factors such as thermal sensor self-calibration and data transmission error, called modality-missing challenge in this work. To address this challenge, we propose a novel invertible prompt learning approach, which integrates the content-preserving prompts into a well-trained tracki…

    Submitted 20 March, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

  36. arXiv:2312.15614  [pdf, other]

    cs.SE cs.AI cs.CL

    A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Software Engineering Tasks

    Authors: Wentao Zou, Qi Li, Jidong Ge, Chuanyi Li, Xiaoyu Shen, Liguo Huang, Bin Luo

    Abstract: Pre-trained models (PTMs) have achieved great success in various Software Engineering (SE) downstream tasks following the "pre-train then fine-tune" paradigm. As fully fine-tuning all parameters of PTMs can be computationally expensive, a widely used solution is parameter-efficient fine-tuning (PEFT), which freezes PTMs while introducing extra parameters. Though work has been done to test PEFT m…

    Submitted 25 December, 2023; originally announced December 2023.

  37. arXiv:2312.12358  [pdf, other]

    cs.IT eess.SP

    Localization and Discrete Beamforming with a Large Reconfigurable Intelligent Surface

    Authors: Baojia Luo, Yili Deng, Miaomiao Dong, Zhongyi Huang, Xiang Chen, Wei Han, Bo Bai

    Abstract: In millimeter-wave (mmWave) cellular systems, reconfigurable intelligent surfaces (RISs) are foreseeably deployed with a large number of reflecting elements to achieve high beamforming gains. The large-sized RIS will make radio links fall in the near-field localization regime with spatial non-stationarity issues. Moreover, the discrete phase restriction on the RIS reflection coefficient incurs exp…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 13 pages

  38. arXiv:2312.05908  [pdf, other]

    cs.CV cs.AI cs.HC

    Multi-Energy Guided Image Translation with Stochastic Differential Equations for Near-Infrared Facial Expression Recognition

    Authors: Bingjun Luo, Zewen Wang, Jinpeng Wang, Junjie Zhu, Xibin Zhao, Yue Gao

    Abstract: Illumination variation has been a long-term challenge in real-world facial expression recognition(FER). Under uncontrolled or non-visible light conditions, Near-infrared (NIR) can provide a simple and alternative solution to obtain high-quality images and supplement the geometric and texture details that are missing in the visible domain. Due to the lack of existing large-scale NIR facial expressi…

    Submitted 10 December, 2023; originally announced December 2023.

  39. arXiv:2312.05907  [pdf, other]

    cs.CV cs.AI cs.HC

    Hypergraph-Guided Disentangled Spectrum Transformer Networks for Near-Infrared Facial Expression Recognition

    Authors: Bingjun Luo, Haowen Wang, Jinpeng Wang, Junjie Zhu, Xibin Zhao, Yue Gao

    Abstract: With the strong robusticity on illumination variations, near-infrared (NIR) can be an effective and essential complement to visible (VIS) facial expression recognition in low lighting or complete darkness conditions. However, facial expression recognition (FER) from NIR images presents more challenging problem than traditional FER due to the limitations imposed by the data scale and the difficulty…

    Submitted 10 December, 2023; originally announced December 2023.

  40. arXiv:2311.18592  [pdf, other]

    cs.CV cs.AI

    Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models

    Authors: Dong Li, Jiandong Jin, Yuhao Zhang, Yanlin Zhong, Yaoyang Wu, Lan Chen, Xiao Wang, Bin Luo

    Abstract: Pattern recognition through the fusion of RGB frames and Event streams has emerged as a novel research area in recent years. Current methods typically employ backbone networks to individually extract the features of RGB frames and event streams, and subsequently fuse these features for pattern recognition. However, we posit that these methods may suffer from key issues like sematic gaps and small-…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: In Peer Review

  41. arXiv:2311.17096  [pdf, other]

    cs.CV

    Robust Transductive Few-shot Learning via Joint Message Passing and Prototype-based Soft-label Propagation

    Authors: Jiahui Wang, Qin Xu, Bo Jiang, Bin Luo

    Abstract: Few-shot learning (FSL) aims to develop a learning model with the ability to generalize to new classes using a few support samples. For transductive FSL tasks, prototype learning and label propagation methods are commonly employed. Prototype methods generally first learn the representative prototypes from the support set and then determine the labels of queries based on the metric between query sa…

    Submitted 28 November, 2023; originally announced November 2023.

  42. arXiv:2311.16835  [pdf, other]

    cs.CV

    Unified-modal Salient Object Detection via Adaptive Prompt Learning

    Authors: Kunpeng Wang, Chenglong Li, Zhengzheng Tu, Zhengyi Liu, Bin Luo

    Abstract: Existing single-modal and multi-modal salient object detection (SOD) methods focus on designing specific architectures tailored for their respective tasks. However, developing completely different models for different tasks leads to labor and time consumption, as well as high computational and practical deployment costs. In this paper, we attempt to address both single-modal and multi-modal SOD in…

    Submitted 5 June, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 13 pages, 11 figures

  43. arXiv:2310.18332  [pdf, other]

    cs.CL cs.AI cs.CV cs.GR

    WordArt Designer: User-Driven Artistic Typography Synthesis using Large Language Models

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Yusen Hu, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou

    Abstract: This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis, relying on the Large Language Model (LLM). The system incorporates four key modules: the LLM Engine, SemTypo, StyTypo, and TexTypo modules. 1) The LLM Engine, empowered by the LLM (e.g., GPT-3.5), interprets user inputs and generates actionable prompts for the other modules, thereby transforming abst…

    Submitted 26 November, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023, 10 pages, 11 figures, 1 table, the system is at https://www.modelscope.cn/studios/WordArt/WordArt

  44. Counterfactual Prediction Under Selective Confounding

    Authors: Sohaib Kiani, Jared Barton, Jon Sushinsky, Lynda Heimbach, Bo Luo

    Abstract: This research addresses the challenge of conducting interpretable causal inference between a binary treatment and its resulting outcome when not all confounders are known. Confounders are factors that have an influence on both the treatment and the outcome. We relax the requirement of knowing all confounders under desired treatment, which we refer to as Selective Confounding, to enable causal infe…

    Submitted 21 October, 2023; originally announced October 2023.

    Journal ref: IOS Press Ebooks pp 1256-1263. Volume 372: ECAI'23 (2023)

  45. arXiv:2310.11709  [pdf, other]

    cs.AI

    Live Graph Lab: Towards Open, Dynamic and Real Transaction Graphs with NFT

    Authors: Zhen Zhang, Bingqiao Luo, Shengliang Lu, Bingsheng He

    Abstract: Numerous studies have been conducted to investigate the properties of large-scale temporal graphs. Despite the ubiquity of these graphs in real-world scenarios, it's usually impractical for us to obtain the whole real-time graphs due to privacy concerns and technical limitations. In this paper, we introduce the concept of Live Graph Lab for temporal graphs, which enables open, dynamic and re…

    Submitted 18 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023, Datasets and Benchmarks Track

  46. arXiv:2310.11417  [pdf, other]

    cs.CV

    VcT: Visual change Transformer for Remote Sensing Image Change Detection

    Authors: Bo Jiang, Zitian Wang, Xixi Wang, Ziyan Zhang, Lan Chen, Xiao Wang, Bin Luo

    Abstract: Existing visual change detectors usually adopt CNNs or Transformers for feature representation learning and focus on learning effective representation for the changed regions between images. Although good performance can be obtained by enhancing the features of the change regions, these works are still limited mainly due to the ignorance of mining the unchanged background context informat…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE Transactions on Geoscience and Remote Sensing (TGRS) 2023

  47. arXiv:2310.01015  [pdf, other]

    cs.SI cs.AI

    EX-Graph: A Pioneering Dataset Bridging Ethereum and X

    Authors: Qian Wang, Zhen Zhang, Zemin Liu, Shengliang Lu, Bingqiao Luo, Bingsheng He

    Abstract: While numerous public blockchain datasets are available, their utility is constrained by an exclusive focus on blockchain data. This constraint limits the incorporation of relevant social network data into blockchain analysis, thereby diminishing the breadth and depth of insight that can be derived. To address the above limitation, we introduce EX-Graph, a novel dataset that authentically links Et…

    Submitted 17 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

  48. arXiv:2309.10491  [pdf, other]

    cs.CV cs.RO

    DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs

    Authors: Jiawen Zhu, Huayi Tang, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Shihao Qiu, Shengming Li, Huchuan Lu

    Abstract: Existing nighttime unmanned aerial vehicle (UAV) trackers follow an "Enhance-then-Track" architecture - first using a light enhancer to brighten the nighttime video, then employing a daytime tracker to locate the object. This separate enhancement and tracking fails to build an end-to-end trainable vision system. To address this, we propose a novel architecture called Darkness Clue-Prompted Trackin…

    Submitted 14 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted by ICRA2024

  49. Practical Program Repair via Preference-based Ensemble Strategy

    Authors: Wenkang Zhong, Chuanyi Li, Kui Liu, Tongtong Xu, Tegawendé F. Bissyandé, Jidong Ge, Bin Luo, Vincent Ng

    Abstract: To date, over 40 Automated Program Repair (APR) tools have been designed with varying bug-fixing strategies, which have been demonstrated to have complementary performance in terms of being effective for different bug classes. Intuitively, it should be feasible to improve the overall bug-fixing performance of APR via assembling existing tools. Unfortunately, simply invoking all available APR tools…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted by ICSE 2024 early

  50. arXiv:2309.04828  [pdf, other]

    cs.SE

    FAIR: Flow Type-Aware Pre-Training of Compiler Intermediate Representations

    Authors: Changan Niu, Chuanyi Li, Vincent Ng, David Lo, Bin Luo

    Abstract: While the majority of existing pre-trained models from code learn source code features such as code tokens and abstract syntax trees, there are some other works that focus on learning from compiler intermediate representations (IRs). Existing IR-based models typically utilize IR features such as instructions, control and data flow graphs (CDFGs), call graphs, etc. However, these methods confuse va…

    Submitted 9 September, 2023; originally announced September 2023.

    Comments: ICSE 2024 First Cycle