Showing 1–15 of 15 results for author: Foo, L G

  1. arXiv:2405.12663

    cs.GR cs.CV

    LAGA: Layered 3D Avatar Generation and Customization via Gaussian Splatting

    Authors: Jia Gong, Shenyu Ji, Lin Geng Foo, Kang Chen, Hossein Rahmani, Jun Liu

    Abstract: Creating and customizing a 3D clothed avatar from textual descriptions is a critical and challenging task. Traditional methods often treat the human body and clothing as inseparable, limiting users' ability to freely mix and match garments. In response to this limitation, we present LAyered Gaussian Avatar (LAGA), a carefully designed framework enabling the creation of high-fidelity decomposable a…

    Submitted 21 May, 2024; originally announced May 2024.

  2. arXiv:2404.01051

    cs.CV

    Action Detection via an Image Diffusion Process

    Authors: Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Jun Liu

    Abstract: Action detection aims to localize the starting and ending points of action instances in untrimmed videos, and predict the classes of those instances. In this paper, we make the observation that the outputs of the action detection task can be formulated as images. Thus, from a novel perspective, we tackle action detection via a three-image generation process to generate starting point, ending point…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  3. arXiv:2404.00925

    cs.CV cs.CL

    LLMs are Good Sign Language Translators

    Authors: Jia Gong, Lin Geng Foo, Yixuan He, Hossein Rahmani, Jun Liu

    Abstract: Sign Language Translation (SLT) is a challenging task that aims to translate sign videos into spoken language. Inspired by the strong translation capabilities of large language models (LLMs) that are trained on extensive multilingual text corpora, we aim to harness off-the-shelf LLMs to handle SLT. In this paper, we regularize the sign videos to embody linguistic characteristics of spoken language…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  4. arXiv:2308.14177

    cs.CV

    AI-Generated Content (AIGC) for Various Data Modalities: A Survey

    Authors: Lin Geng Foo, Hossein Rahmani, Jun Liu

    Abstract: AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and other media using AI algorithms. Due to its wide range of applications and the demonstrated potential of recent works, AIGC developments have been attracting lots of attention recently, and AIGC methods have been developed for various data modalities, such as image, video, text, 3D shape (as voxels, point cloud…

    Submitted 21 October, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

  5. arXiv:2308.13369

    cs.CV

    Distribution-Aligned Diffusion for Human Mesh Recovery

    Authors: Lin Geng Foo, Jia Gong, Hossein Rahmani, Jun Liu

    Abstract: Recovering a 3D human mesh from a single RGB image is a challenging task due to depth ambiguity and self-occlusion, resulting in a high degree of uncertainty. Meanwhile, diffusion models have recently seen much success in generating high-quality outputs by progressively denoising noisy inputs. Inspired by their capability, we explore a diffusion-based approach for human mesh recovery, and propose…

    Submitted 24 October, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023

  6. arXiv:2304.06724

    cs.CR cs.CV cs.LG

    GradMDM: Adversarial Attack on Dynamic Networks

    Authors: Jianhong Pan, Lin Geng Foo, Qichen Zheng, Zhipeng Fan, Hossein Rahmani, Qiuhong Ke, Jun Liu

    Abstract: Dynamic neural networks can greatly reduce computation redundancy without compromising accuracy by adapting their structures based on the input. In this paper, we explore the robustness of dynamic neural networks against energy-oriented attacks targeted at reducing their efficiency. Specifically, we attack dynamic models with our novel algorithm GradMDM. GradMDM is a technique that adjusts the dir…

    Submitted 1 April, 2023; originally announced April 2023.

    Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  7. arXiv:2304.04175

    cs.CV

    Token Boosting for Robust Self-Supervised Visual Transformer Pre-training

    Authors: Tianjiao Li, Lin Geng Foo, Ping Hu, Xindi Shang, Hossein Rahmani, Zehuan Yuan, Jun Liu

    Abstract: Learning with large-scale unlabeled data has become a powerful tool for pre-training Visual Transformers (VTs). However, prior works tend to overlook that, in real-world scenarios, the input data may be corrupted and unreliable. Pre-training VTs on such corrupted data can be challenging, especially when we pre-train via the masked autoencoding approach, where both the inputs and masked "ground tr…

    Submitted 12 April, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  8. arXiv:2304.00280

    cs.CV

    Progressive Channel-Shrinking Network

    Authors: Jianhong Pan, Siyuan Yang, Lin Geng Foo, Qiuhong Ke, Hossein Rahmani, Zhipeng Fan, Jun Liu

    Abstract: Currently, salience-based channel pruning makes continuous breakthroughs in network compression. In the realization, the salience mechanism is used as a metric of channel salience to guide pruning. Therefore, salience-based channel pruning can dynamically adjust the channel width at run-time, which provides a flexible pruning scheme. However, there are two problems emerging: a gating function is o…

    Submitted 1 April, 2023; originally announced April 2023.

  9. arXiv:2303.15742

    cs.CV

    System-status-aware Adaptive Network for Online Streaming Video Understanding

    Authors: Lin Geng Foo, Jia Gong, Zhipeng Fan, Jun Liu

    Abstract: Recent years have witnessed great progress in deep neural networks for real-time applications. However, most existing works do not explicitly consider the general case where the device's state and the available resources fluctuate over time, and none of them investigate or address the impact of varying computational resources for online video understanding tasks. This paper proposes a System-statu…

    Submitted 9 April, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023

  10. arXiv:2211.16940

    cs.CV

    DiffPose: Toward More Reliable 3D Pose Estimation

    Authors: Jia Gong, Lin Geng Foo, Zhipeng Fan, Qiuhong Ke, Hossein Rahmani, Jun Liu

    Abstract: Monocular 3D human pose estimation is quite challenging due to the inherent ambiguity and occlusion, which often lead to high uncertainty and indeterminacy. On the other hand, diffusion models have recently emerged as an effective tool for generating high-quality images from noise. Inspired by their capability, we explore a novel pose estimation framework (DiffPose) that formulates 3D pose estimat…

    Submitted 9 April, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted to CVPR 2023

  11. arXiv:2210.06776

    cs.CV

    Improving the Reliability for Confidence Estimation

    Authors: Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu

    Abstract: Confidence estimation, a task that aims to evaluate the trustworthiness of the model's prediction output during deployment, has received lots of research attention recently, due to its importance for the safe deployment of deep models. Previous works have outlined two important qualities that a reliable confidence estimation model should possess, i.e., the ability to perform well under label imbal…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted by ECCV 2022

  12. arXiv:2210.00740

    cs.CV

    Heatmap Distribution Matching for Human Pose Estimation

    Authors: Haoxuan Qu, Li Xu, Yujun Cai, Lin Geng Foo, Jun Liu

    Abstract: For tackling the task of 2D human pose estimation, the great majority of the recent methods regard this task as a heatmap estimation problem, and optimize the heatmap prediction using the Gaussian-smoothed heatmap as the optimization objective and using the pixel-wise loss (e.g. MSE) as the loss function. In this paper, we show that optimizing the heatmap prediction in such a way, the model perfor…

    Submitted 3 October, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  13. arXiv:2209.01425

    cs.CV

    Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

    Authors: Tianjiao Li, Lin Geng Foo, Qiuhong Ke, Hossein Rahmani, Anran Wang, Jinghua Wang, Jun Liu

    Abstract: The goal of fine-grained action recognition is to successfully discriminate between action categories with subtle differences. To tackle this, we derive inspiration from the human visual system which contains specialized regions in the brain that are dedicated towards handling specific tasks. We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neur…

    Submitted 3 September, 2022; originally announced September 2022.

    Comments: Accepted to ECCV 2022

  14. arXiv:2207.09675

    cs.CV

    ERA: Expert Retrieval and Assembly for Early Action Prediction

    Authors: Lin Geng Foo, Tianjiao Li, Hossein Rahmani, Qiuhong Ke, Jun Liu

    Abstract: Early action prediction aims to successfully predict the class label of an action before it is completely performed. This is a challenging task because the beginning stages of different actions can be very similar, with only minor subtle differences for discrimination. In this paper, we propose a novel Expert Retrieval and Assembly (ERA) module that retrieves and assembles a set of experts most sp…

    Submitted 22 July, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022

  15. arXiv:2007.10817

    cs.CV cs.LG eess.IV

    Split and Expand: An inference-time improvement for Weakly Supervised Cell Instance Segmentation

    Authors: Lin Geng Foo, Rui En Ho, Jiamei Sun, Alexander Binder

    Abstract: We consider the problem of segmenting cell nuclei instances from Hematoxylin and Eosin (H&E) stains with weak supervision. While most recent works focus on improving the segmentation quality, this is usually insufficient for instance segmentation of cell instances clumped together or with a small size. In this work, we propose a two-step post-processing procedure, Split and Expand, that directly i…

    Submitted 14 March, 2022; v1 submitted 21 July, 2020; originally announced July 2020.