Showing 1–12 of 12 results for author: Ran, L

  1. arXiv:2406.20076  [pdf, other]

    cs.CV

    EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model

    Authors: Yuxuan Zhang, Tianheng Cheng, Rui Hu, Lei Liu, Heng Liu, Longjin Ran, Xiaoxin Chen, Wenyu Liu, Xinggang Wang

    Abstract: Segment Anything Model (SAM) has attracted widespread attention for its superior interactive segmentation capabilities with visual prompts while lacking further exploration of text prompts. In this paper, we empirically investigate what text prompt encoders (e.g., CLIP or LLM) are good for adapting SAM for referring expression segmentation and introduce the Early Vision-language Fusion-based SAM (…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Preprint

  2. arXiv:2405.16988  [pdf, other]

    cs.NI

    An experimental study of the response time in an edge-cloud continuum with ClusterLink

    Authors: Marc Michalke, Fin Gentzen, Admela Jukan, Kfir Toledo, Etai Lev Ran

    Abstract: In this paper, we conduct an experimental study to provide a general sense of the application response time implications that inter-cluster communication experiences at the edge, using the example of a specific IoT-edge-cloud continuum solution from the EU Project ICOS called ClusterLink. We create an environment to emulate different networking topologies that include multiple cloud or edge sites scena…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Submitted to IECCONT workshop co-hosted with Euro-Par 2024 https://2024.euro-par.org/workshops/workshops/

  3. arXiv:2403.02784  [pdf, other]

    cs.CV

    DDF: A Novel Dual-Domain Image Fusion Strategy for Remote Sensing Image Semantic Segmentation with Unsupervised Domain Adaptation

    Authors: Lingyan Ran, Lushuang Wang, Tao Zhuo, Yinghui Xing

    Abstract: Semantic segmentation of remote sensing images is a challenging and hot issue due to the large amount of unlabeled data. Unsupervised domain adaptation (UDA) has proven to be advantageous in incorporating unclassified information from the target domain. However, independently fine-tuning UDA models on the source and target domains has a limited effect on the outcome. This paper proposes a hybrid t…

    Submitted 5 March, 2024; originally announced March 2024.

  4. arXiv:2403.01909  [pdf, other]

    cs.CV cs.AI

    Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey

    Authors: Lingyan Ran, Yali Li, Guoqiang Liang, Yanning Zhang

    Abstract: Semantic segmentation is an important and popular research area in computer vision that focuses on classifying pixels in an image based on their semantics. However, supervised deep learning requires large amounts of data to train models and the process of labeling images pixel by pixel is time-consuming and laborious. This review aims to provide a first comprehensive and organized overview of the…

    Submitted 4 March, 2024; originally announced March 2024.

  5. arXiv:2402.03669  [pdf, other]

    cs.GT cs.MA

    Distributed Generalized Nash Equilibria Seeking Algorithms Involving Synchronous and Asynchronous Schemes

    Authors: Huaqing Li, Liang Ran, Lifeng Zheng, Zhe Li, Jinhui Hu, Jun Li, Tingwen Huang

    Abstract: This paper considers a class of noncooperative games in which the feasible decision sets of all players are coupled together by a coupled inequality constraint. Adopting the variational inequality formulation of the game, we first introduce a new local edge-based equilibrium condition and develop a distributed primal-dual proximal algorithm with full information. Considering challenges when commun…

    Submitted 11 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 13 pages, 2 figures

  6. arXiv:2312.02238  [pdf, other]

    cs.CV cs.AI cs.MM

    X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

    Authors: Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou

    Abstract: We introduce X-Adapter, a universal upgrader to enable the pretrained plug-and-play modules (e.g., ControlNet, LoRA) to work directly with the upgraded text-to-image diffusion model (e.g., SDXL) without further retraining. We achieve this goal by training an additional network to control the frozen upgraded model with the new text-image data pairs. In detail, X-Adapter keeps a frozen copy of the o…

    Submitted 23 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Project page: https://showlab.github.io/X-Adapter/

  7. arXiv:2310.09883  [pdf, other]

    cs.CV cs.RO

    Zero-Shot Object Goal Visual Navigation With Class-Independent Relationship Network

    Authors: Xinting Li, Shiguang Zhang, Yue LU, Kerry Dang, Lingyan Ran

    Abstract: This paper investigates the zero-shot object goal visual navigation problem. In the object goal visual navigation task, the agent needs to locate navigation targets from its egocentric visual input. "Zero-shot" means that the target the agent needs to find is not trained during the training phase. To address the issue of coupling navigation ability with target features during training, we propose…

    Submitted 14 March, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

    ACM Class: I.2.9; I.2.10

  8. arXiv:2309.15818  [pdf, other]

    cs.CV

    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

    Authors: David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou

    Abstract: Significant advancements have been achieved in the realm of large-scale pre-trained text-to-video Diffusion Models (VDMs). However, previous methods either rely solely on pixel-based VDMs, which come with high computational costs, or on latent-based VDMs, which often struggle with precise text-video alignment. In this paper, we are the first to propose a hybrid model, dubbed as Show-1, which marri…

    Submitted 17 October, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: Project page: https://showlab.github.io/Show-1

  9. arXiv:2307.10685  [pdf, other]

    cs.CV

    Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

    Authors: Yinghui Xing, Dexuan Kong, Shizhou Zhang, Geng Chen, Lingyan Ran, Peng Wang, Yanning Zhang

    Abstract: Camouflaged object detection (COD), aiming to segment camouflaged objects which exhibit similar patterns with the background, is a challenging task. Most existing works are dedicated to establishing specialized modules to identify camouflaged objects with complete and fine details, while the boundary can not be well located for the lack of object-related semantics. In this paper, we propose a nove…

    Submitted 22 August, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

  10. A Novel Rapid-flooding Approach with Real-time Delay Compensation for Wireless Sensor Network Time Synchronization

    Authors: Fanrong Shi, Simon X. Yang, Xianguo Tuo, Lili Ran, Yuqing Huang

    Abstract: One-way-broadcast based flooding time synchronization algorithms are commonly used in wireless sensor networks (WSNs). However, the packet delay and clock drift pose challenges to accuracy, as they entail serious by-hop error accumulation problems in the WSNs. To overcome it, a rapid flooding multi-broadcast time synchronization with real-time delay compensation (RDC-RMTS) is proposed in this pape…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: 14 pages, 11 figures

    Journal ref: IEEE TRANSACTIONS ON CYBERNETICS, VOL. 52, NO. 3, MARCH 2022

  11. arXiv:2111.15050  [pdf, other]

    cs.CV

    AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant

    Authors: Stan Weixian Lei, Difei Gao, Yuxuan Wang, Dongxing Mao, Zihan Liang, Lingmin Ran, Mike Zheng Shou

    Abstract: It is still a pipe dream that personal AI assistants on the phone and AR glasses can assist our daily life in addressing our questions like "how to adjust the date for this watch?" and "how to set its heating duration? (while pointing at an oven)". The queries used in conventional tasks (i.e. Video Question Answering, Video Retrieval, Moment Localization) are often factoid and based on pure te…

    Submitted 10 October, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: 20 pages, 12 figures

  12. arXiv:2104.01526  [pdf, other]

    cs.CV

    Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images

    Authors: Xinggang Wang, Jiapei Feng, Bin Hu, Qi Ding, Longjin Ran, Xiaoxin Chen, Wenyu Liu

    Abstract: Humans have a strong class-agnostic object segmentation ability and can outline boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation. The BoxCaseg model is jointly trained using box-supervised images and salient images in a multi-task learning manner. The fine…

    Submitted 3 April, 2021; originally announced April 2021.

    Journal ref: CVPR 2021