Showing 1–5 of 5 results for author: Xuan, S

  1. arXiv:2406.04659  [pdf, other]

    cs.CV

    LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model

    Authors: Dongkai Wang, Shiyu Xuan, Shiliang Zhang

    Abstract: The capacity of existing human keypoint localization models is limited by keypoint priors provided by the training data. To alleviate this restriction and pursue a more general model, this work studies keypoint localization from a different perspective by reasoning locations based on keypoint clues in text descriptions. We propose LocLLM, the first Large-Language Model (LLM) based keypoint localizat…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: CVPR2024

  2. arXiv:2403.06151  [pdf, other]

    cs.CV

    Decoupled Contrastive Learning for Long-Tailed Recognition

    Authors: Shiyu Xuan, Shiliang Zhang

    Abstract: Supervised Contrastive Loss (SCL) is popular in visual representation learning. Given an anchor image, SCL pulls two types of positive samples, i.e., its augmentation and other images from the same class, together while pushing negative images apart to optimize the learned embedding. In the scenario of long-tailed recognition, where the number of samples in each class is imbalanced, treating two ty…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted by AAAI 2024

  3. arXiv:2310.00582  [pdf, other]

    cs.CV cs.AI

    Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs

    Authors: Shiyu Xuan, Qingpei Guo, Ming Yang, Shiliang Zhang

    Abstract: Multi-modal Large Language Models (MLLMs) have shown remarkable capabilities in various multi-modal tasks. Nevertheless, their performance in fine-grained image understanding tasks is still limited. To address this issue, this paper proposes a new framework to enhance the fine-grained image understanding abilities of MLLMs. Specifically, we present a new method for constructing the instruction tun…

    Submitted 12 March, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  4. arXiv:2103.11658  [pdf, other]

    cs.CV cs.AI

    Intra-Inter Camera Similarity for Unsupervised Person Re-Identification

    Authors: Shiyu Xuan, Shiliang Zhang

    Abstract: Most unsupervised person Re-Identification (Re-ID) works produce pseudo-labels by measuring feature similarity without considering the distribution discrepancy among cameras, leading to degraded accuracy in label computation across cameras. This paper aims to address this challenge by studying a novel intra-inter camera similarity for pseudo-label generation. We decompose the sample simi…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: CVPR2021

  5. arXiv:2101.09428  [pdf, other]

    cs.LG cs.DC

    Vertical federated learning based on DFP and BFGS

    Authors: Song WenJie, Shen Xuan

    Abstract: As data privacy is increasingly valued, federated learning (FL) has emerged because of its potential to protect data. FL uses homomorphic encryption and differential privacy encryption, on the premise of ensuring data security, to realize distributed machine learning by exchanging encrypted information between different data providers. However, there are still many problems in FL, such as the…

    Submitted 23 January, 2021; originally announced January 2021.