Showing 1–21 of 21 results for author: Lian, L

  1. arXiv:2405.03851  [pdf, other]

    cs.DB cs.DS

    Querying in Constant Time with Learned Indexes

    Authors: Luis Croquevielle, Guang Yang, Liang Lian, Ali Hadian, Thomas Heinis

    Abstract: Learned indexes leverage machine learning models to accelerate query answering in databases, showing impressive practical performance. However, theoretical understanding of these methods remains incomplete. Existing research suggests that learned indexes have superior asymptotic complexity compared to their non-learned counterparts, but these findings have been established under restrictive probab…

    Submitted 13 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  2. arXiv:2401.14391  [pdf, other]

    cs.CV

    Rethinking Patch Dependence for Masked Autoencoders

    Authors: Letian Fu, Long Lian, Renhao Wang, Baifeng Shi, Xudong Wang, Adam Yala, Trevor Darrell, Alexei A. Efros, Ken Goldberg

    Abstract: In this work, we re-examine inter-patch dependencies in the decoding mechanism of masked autoencoders (MAE). We decompose this decoding mechanism for masked patch reconstruction in MAE into self-attention and cross-attention. Our investigations suggest that self-attention between mask patches is not essential for learning good representations. To this end, we propose a novel pretraining framework:…

    Submitted 25 January, 2024; originally announced January 2024.

  3. arXiv:2312.17243  [pdf, other]

    cs.CV

    Unsupervised Universal Image Segmentation

    Authors: Dantong Niu, Xudong Wang, Xinyang Han, Long Lian, Roei Herzig, Trevor Darrell

    Abstract: Several unsupervised image segmentation approaches have been proposed which eliminate the need for dense manually-annotated segmentation masks; current models separately handle either semantic segmentation (e.g., STEGO) or class-agnostic instance segmentation (e.g., CutLER), but not both (i.e., panoptic segmentation). We propose an Unsupervised Universal Segmentation model (U2Seg) adept at perform…

    Submitted 28 December, 2023; originally announced December 2023.

  4. arXiv:2311.16090  [pdf, other]

    cs.CV

    Self-correcting LLM-controlled Diffusion Models

    Authors: Tsung-Han Wu, Long Lian, Joseph E. Gonzalez, Boyi Li, Trevor Darrell

    Abstract: Text-to-image generation has witnessed significant progress with the advent of diffusion models. Despite the ability to generate photorealistic images, current text-to-image diffusion models still often struggle to accurately interpret and follow complex input text prompts. In contrast to existing models that aim to generate images only with their best effort, we introduce Self-correcting LLM-cont…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 16 pages, 10 figures

  5. arXiv:2309.17444  [pdf, other]

    cs.CV cs.AI cs.CL

    LLM-grounded Video Diffusion Models

    Authors: Long Lian, Baifeng Shi, Adam Yala, Trevor Darrell, Boyi Li

    Abstract: Text-conditioned diffusion models have emerged as a promising tool for neural video generation. However, current models still struggle with intricate spatiotemporal prompts and often generate restricted or incorrect motion. To address these limitations, we introduce LLM-grounded Video Diffusion (LVD). Instead of directly generating videos from the text inputs, LVD first leverages a large language…

    Submitted 4 May, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: ICLR 2024. Project Page: https://llm-grounded-video-diffusion.github.io/

  6. arXiv:2305.13655  [pdf, other]

    cs.CV

    LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

    Authors: Long Lian, Boyi Li, Adam Yala, Trevor Darrell

    Abstract: Recent advancements in text-to-image diffusion models have yielded impressive results in generating realistic and diverse images. However, these models still struggle with complex prompts, such as those that involve numeracy and spatial reasoning. This work proposes to enhance prompt understanding capabilities in diffusion models. Our method leverages a pretrained large language model (LLM) for gr…

    Submitted 4 March, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Transactions on Machine Learning Research (TMLR) 2024, with Featured Certification

  7. arXiv:2304.08025  [pdf, other]

    cs.CV

    Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

    Authors: Long Lian, Zhirong Wu, Stella X. Yu

    Abstract: We study learning object segmentation from unlabeled videos. Humans can easily segment moving objects without knowing what they are. The Gestalt law of common fate, i.e., what move at the same speed belong together, has inspired unsupervised object discovery based on motion segmentation. However, common fate is not a reliable indicator of objectness: Parts of an articulated / deformable object may…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023. An extension of preprint 2212.08816. 19 pages, 11 figures

  8. arXiv:2303.02960  [pdf, other]

    eess.SP

    Adaptive Multi-User Channel Estimation Based on Contrastive Feature Learning

    Authors: Yihan Xu, Lixiang Lian

    Abstract: Correlation exploitation is essential for efficient multi-user channel estimation (MUCE) in massive MIMO systems. However, the existing works either rely on presumed strong correlation or learn the correlation through a large amount of labeled data, which is difficult to acquire in a real system. In this paper, we propose an adaptive MUCE algorithm based on contrastive feature learning. The contras…

    Submitted 6 March, 2023; originally announced March 2023.

  9. arXiv:2302.04304  [pdf, other]

    cs.CV cs.LG

    Q-Diffusion: Quantizing Diffusion Models

    Authors: Xiuyu Li, Yijiang Liu, Long Lian, Huanrui Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer

    Abstract: Diffusion models have achieved great success in image synthesis through iterative noise estimation using deep neural networks. However, the slow inference, high memory consumption, and computation intensity of the noise estimation model hinder the efficient adoption of diffusion models. Although post-training quantization (PTQ) is considered a go-to compression method for other tasks, it does not…

    Submitted 8 June, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: The code is available at https://github.com/Xiuyu-Li/q-diffusion

  10. arXiv:2301.03377  [pdf, other]

    eess.SP cs.LG cs.NI

    Machine Learning for Large-Scale Optimization in 6G Wireless Networks

    Authors: Yandong Shi, Lixiang Lian, Yuanming Shi, Zixin Wang, Yong Zhou, Liqun Fu, Lin Bai, Jun Zhang, Wei Zhang

    Abstract: The sixth generation (6G) wireless systems are envisioned to enable the paradigm shift from "connected things" to "connected intelligence", featured by ultra high density, large-scale, dynamic heterogeneity, diversified functional requirements and machine learning capabilities, which leads to a growing need for highly efficient intelligent algorithms. The classic optimization-based algorithms usua…

    Submitted 3 January, 2023; originally announced January 2023.

  11. arXiv:2212.08816  [pdf, other]

    cs.CV

    Improving Unsupervised Video Object Segmentation with Motion-Appearance Synergy

    Authors: Long Lian, Zhirong Wu, Stella X. Yu

    Abstract: We present IMAS, a method that segments the primary objects in videos without manual annotation in training or inference. Previous methods in unsupervised video object segmentation (UVOS) have demonstrated the effectiveness of motion as either input or supervision for segmentation. However, motion signals may be uninformative or even misleading in cases such as deformable objects and objects with…

    Submitted 17 December, 2022; originally announced December 2022.

    Comments: 15 pages, 10 figures

  12. arXiv:2211.03628  [pdf, other]

    cs.LG cs.DC eess.SP

    Decentralized Complete Dictionary Learning via $\ell^{4}$-Norm Maximization

    Authors: Qiheng Lu, Lixiang Lian

    Abstract: With the rapid development of information technologies, centralized data processing is subject to many limitations, such as computational overheads, communication delays, and data privacy leakage. Decentralized data processing over networked terminal nodes becomes an important technology in the era of big data. Dictionary learning is a powerful representation learning method to exploit the low-dim…

    Submitted 26 November, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

  13. arXiv:2208.03928  [pdf, ps, other]

    cs.IT

    Reconfigurable Intelligent Surfaces Empowered Cooperative Rate Splitting with User Relaying

    Authors: Kangchun Zhao, Yijie Mao, Zhaohui Yang, Lixiang Lian, Bruno Clerckx

    Abstract: Cooperative rate splitting (CRS), built upon rate splitting multiple access (RSMA) and opportunistic user relaying, has been recognized as a promising transmission strategy to enhance the user fairness and spectral efficiency in multiantenna broadcast channels. To further boost its performance, the interplay of CRS and reconfigurable intelligent surface (RIS) is investigated in this work. Specific…

    Submitted 8 August, 2022; originally announced August 2022.

    Comments: 6 pages, 5 figures

  14. arXiv:2206.03596  [pdf, other]

    cs.LG cs.CV eess.IV

    Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning

    Authors: Ziqi Zhou, Li Lian, Yilong Yin, Ze Wang

    Abstract: Network compression is crucial to making deep networks more efficient, faster, and generalizable to low-end hardware. Current network compression methods have two open problems: first, a theoretical framework for estimating the maximum compression rate is lacking; second, some layers may get over-pruned, resulting in a significant drop in network performance. To solve these two problems, this…

    Submitted 7 June, 2022; originally announced June 2022.

  15. arXiv:2205.04227  [pdf]

    eess.IV cs.CV

    Mixed-UNet: Refined Class Activation Mapping for Weakly-Supervised Semantic Segmentation with Multi-scale Inference

    Authors: Yang Liu, Ersi Zhang, Lulu Xu, Chufan Xiao, Xiaoyun Zhong, Li** Lian, Fang Li, Bin Jiang, Yuhan Dong, Lan Ma, Qiming Huang, Ming Xu, Yongbing Zhang, Dongmei Yu, Chenggang Yan, Peiwu Qin

    Abstract: Deep learning techniques have shown great potential in medical image processing, particularly through accurate and reliable image segmentation on magnetic resonance imaging (MRI) scans or computed tomography (CT) scans, which allow the localization and diagnosis of lesions. However, training these segmentation models requires a large number of manually annotated pixel-level labels, which are time-…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: 12 pages, 7 figures

  16. arXiv:2204.03329  [pdf]

    cs.RO eess.SY

    Information-driven Path Planning for Hybrid Aerial Underwater Vehicles

    Authors: Zheng Zeng, Chengke Xiong, Xinyi Yuan, Yulin Bai, Yufei **, Di Lu, Lian Lian

    Abstract: This paper presents a novel Rapidly-exploring Adaptive Sampling Tree (RAST) algorithm for the adaptive sampling mission of a hybrid aerial underwater vehicle (HAUV) in an air-sea 3D environment. This algorithm innovatively combines the tournament-based point selection sampling strategy, the information heuristic search process and the framework of Rapidly-exploring Random Tree (RRT) algorithm. Hen…

    Submitted 8 April, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

  17. arXiv:2201.01490  [pdf, other]

    cs.LG cs.CL cs.CV

    Debiased Learning from Naturally Imbalanced Pseudo-Labels

    Authors: Xudong Wang, Zhirong Wu, Long Lian, Stella X. Yu

    Abstract: Pseudo-labels are confident predictions made on unlabeled target data by a classifier trained on labeled source data. They are widely used for adapting a model to unlabeled data, e.g., in a semi-supervised learning setting. Our key insight is that pseudo-labels are naturally imbalanced due to intrinsic data similarity, even when a model is trained on balanced source data and evaluated on balance…

    Submitted 21 April, 2022; v1 submitted 5 January, 2022; originally announced January 2022.

    Comments: Accepted by CVPR 2022

  18. arXiv:2111.00470  [pdf, other]

    eess.SP

    Wireless Federated Learning over MIMO Networks: Joint Device Scheduling and Beamforming Design

    Authors: Shaoming Huang, Pengfei Zhang, Yijie Mao, Lixiang Lian, Yuanming Shi

    Abstract: Federated learning (FL) is recognized as a key enabling technology to support distributed artificial intelligence (AI) services in future 6G. By supporting decentralized data training and collaborative model training among devices, FL inherently tames privacy leakage and reduces transmission costs. However, the performance of wireless FL is typically restricted by the communication latency. Mu…

    Submitted 31 October, 2021; originally announced November 2021.

    Comments: Submitted to 2022 IEEE International Conference on Communications

  19. arXiv:2110.03006  [pdf, other]

    cs.LG cs.CV

    Unsupervised Selective Labeling for More Effective Semi-Supervised Learning

    Authors: Xudong Wang, Long Lian, Stella X. Yu

    Abstract: Given an unlabeled dataset and an annotation budget, we study how to selectively label a fixed number of instances so that semi-supervised learning (SSL) on such a partially labeled dataset is most effective. We focus on selecting the right data to label, in addition to the usual SSL step of propagating labels from labeled data to the remaining unlabeled data. This instance selection task is challenging, as with…

    Submitted 23 August, 2023; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Accepted by ECCV 2022; Fixed a few typos

  20. arXiv:2104.02921  [pdf, other]

    cs.RO cs.CV

    Unsupervised Visual Attention and Invariance for Reinforcement Learning

    Authors: Xudong Wang, Long Lian, Stella X. Yu

    Abstract: Vision-based reinforcement learning (RL) is successful, but how to generalize it to unknown test environments remains challenging. Existing methods focus on training an RL policy that is universal to changing visual domains, whereas we focus on extracting visual foreground that is universal, feeding clean invariant vision to the RL policy learner. Our method is completely unsupervised, without man…

    Submitted 16 April, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted at CVPR 2021

  21. arXiv:2010.01809  [pdf, other]

    cs.CV

    Long-tailed Recognition by Routing Diverse Distribution-Aware Experts

    Authors: Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu

    Abstract: Natural data are often long-tail distributed over semantic classes. Existing recognition methods tackle this imbalanced classification by placing more emphasis on the tail data, through class re-balancing/re-weighting or ensembling over different data groups, resulting in increased tail accuracies but reduced head accuracies. We take a dynamic view of the training data and provide a principled m…

    Submitted 1 May, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted at ICLR 2021 (Spotlight); Add experiments on Swin Transformer