Showing 1–5 of 5 results for author: Castells, T

  1. arXiv:2404.11936  [pdf, other]

    cs.LG cs.AI cs.CV

    LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights

    Authors: Thibault Castells, Hyoung-Kyu Song, Bo-Kyeong Kim, Shinkook Choi

    Abstract: Latent Diffusion Models (LDMs) have emerged as powerful generative models, known for delivering remarkable results under constrained computational resources. However, deploying LDMs on resource-limited devices remains a complex issue, presenting challenges such as memory consumption and inference speed. To address this issue, we introduce LD-Pruner, a novel performance-preserving structured prunin…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 8 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

  2. arXiv:2404.11925  [pdf, other]

    cs.LG cs.AI cs.CV

    EdgeFusion: On-Device Text-to-Image Generation

    Authors: Thibault Castells, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Changgwun Lee, Jae Gon Kim, Tae-Ho Kim

    Abstract: The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application. To tackle this challenge, recent research focuses on methods to reduce sampling steps, such as Latent Consistency Model (LCM), and on employing architectural optimizations, including pruning and knowledge distillation. Diverging from existing approaches…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 4 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

  3. arXiv:2402.02834  [pdf, other]

    cs.LG cs.CL

    Shortened LLaMA: Depth Pruning for Large Language Models with Comparison of Retraining Methods

    Authors: Bo-Kyeong Kim, Geonmin Kim, Tae-Ho Kim, Thibault Castells, Shinkook Choi, Junho Shin, Hyoung-Kyu Song

    Abstract: Structured pruning of modern large language models (LLMs) has emerged as a way of decreasing their high computational needs. Width pruning reduces the size of projection weight matrices (e.g., by removing attention heads) while maintaining the number of layers. Depth pruning, in contrast, removes entire layers or blocks, while keeping the size of the remaining weights unchanged. Most current resea…

    Submitted 23 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Update (arXiv-v2): continued pretraining for severe pruning ratios, compatibility with quantization, and enhanced baselines. Preliminary work (arXiv-v1) accepted at ICLR 2024 Workshop on ME-FoMo: https://openreview.net/forum?id=18VGxuOdpu

  4. arXiv:2305.15798  [pdf, other]

    cs.LG

    BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion

    Authors: Bo-Kyeong Kim, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi

    Abstract: Text-to-image (T2I) generation with Stable Diffusion models (SDMs) involves high computing demands due to billion-scale parameters. To enhance efficiency, recent studies have reduced sampling steps and applied network quantization while retaining the original architectures. The lack of architectural reduction attempts may stem from worries over expensive retraining for such massive models. In this…

    Submitted 16 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Updated results; Preliminary version at ICML Workshop on ES-FoMo (2023): https://openreview.net/forum?id=bOVydU0XKC

  5. arXiv:2111.09635  [pdf, other]

    cs.CV

    Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy

    Authors: Thibault Castells, Seul-Ki Yeom

    Abstract: Neural networks performance has been significantly improved in the last few years, at the cost of an increasing number of floating point operations per second (FLOPs). However, more FLOPs can be an issue when computational resources are limited. As an attempt to solve this problem, pruning filters is a common solution, but most existing pruning methods do not preserve the model accuracy efficientl…

    Submitted 7 December, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

    Comments: 11 pages, 6 figures, 5 tables, accepted in AAAI2023 Workshop (Practical AI)