Showing 1–26 of 26 results for author: Qu, Q

  1. arXiv:2406.05822

    cs.LG stat.ML

    Symmetric Matrix Completion with ReLU Sampling

    Authors: Huikang Liu, Peng Wang, Longxiu Huang, Qing Qu, Laura Balzano

    Abstract: We study the problem of symmetric positive semi-definite low-rank matrix completion (MC) with deterministic entry-dependent sampling. In particular, we consider rectified linear unit (ReLU) sampling, where only positive entries are observed, as well as a generalization to threshold-based sampling. We first empirically demonstrate that the landscape of this MC problem is not globally benign: Gradie…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 39 pages, 9 figures; This work has been accepted for publication in the Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  2. arXiv:2406.04112

    cs.LG cs.AI eess.SP stat.ML

    Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

    Authors: Can Yaras, Peng Wang, Laura Balzano, Qing Qu

    Abstract: While overparameterization in machine learning models offers great benefits in terms of optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the…

    Submitted 9 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML'24 (Oral)

  3. arXiv:2311.05061

    cs.LG stat.ML

    Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics

    Authors: Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, Qing Qu

    Abstract: Overparameterized models have proven to be powerful tools for solving various machine learning tasks. However, overparameterization often leads to a substantial increase in computational and memory costs, which in turn requires extensive resources to train. In this work, we present a novel approach for compressing overparameterized models, developed through studying their learning dynamics. We obs…

    Submitted 11 March, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  4. arXiv:2212.12206

    cs.LG cs.AI cs.CV eess.IV stat.ML

    Principled and Efficient Transfer Learning of Deep Models via Neural Collapse

    Authors: Xiao Li, Sheng Liu, Jinxin Zhou, Xinyu Lu, Carlos Fernandez-Granda, Zhihui Zhu, Qing Qu

    Abstract: As model size continues to grow and access to labeled training data remains limited, transfer learning has become a popular approach in many scientific and engineering fields. This study explores the phenomenon of neural collapse (NC) in transfer learning for classification problems, which is characterized by the last-layer features and classifiers of deep networks having zero within-class variabi…

    Submitted 26 February, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

    Comments: First two authors contributed equally, 29 pages, 14 figures, and 7 tables

  5. arXiv:2210.02192

    cs.LG cs.AI cs.IT math.OC stat.ML

    Are All Losses Created Equal: A Neural Collapse Perspective

    Authors: Jinxin Zhou, Chong You, Xiao Li, Kangning Liu, Sheng Liu, Qing Qu, Zhihui Zhu

    Abstract: While cross entropy (CE) is the most commonly used loss to train deep neural networks for classification tasks, many alternative losses have been developed to obtain better empirical performance. Among them, which one is the best to use is still a mystery, because there seem to be multiple factors affecting the answer, such as properties of the dataset, the choice of network architecture, and so o…

    Submitted 8 October, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 32 pages, 10 figures, NeurIPS 2022

  6. arXiv:2209.09211

    cs.LG cs.CV cs.IT eess.SP stat.ML

    Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

    Authors: Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

    Abstract: When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon. More specifically, for the output features of the penultimate layer, for each class the within-class features converge to their means, and the means of different classes exhibit a certain tight frame structure, which is also…

    Submitted 7 March, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: The first two authors contributed to this work equally; 38 pages, 13 figures. Accepted at NeurIPS'22

  7. arXiv:2203.01238

    cs.LG cs.AI cs.IT math.OC stat.ML

    On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features

    Authors: Jinxin Zhou, Xiao Li, Tianyu Ding, Chong You, Qing Qu, Zhihui Zhu

    Abstract: When training deep neural networks for classification tasks, an intriguing empirical phenomenon has been widely observed in the last-layer classifiers and features, where (i) the class means and the last-layer classifiers all collapse to the vertices of a Simplex Equiangular Tight Frame (ETF) up to scaling, and (ii) cross-example within-class variability of last-layer activations collapses to zero…

    Submitted 12 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

  8. arXiv:2202.14026

    cs.LG cs.AI cs.CV stat.ML

    Robust Training under Label Noise by Over-parameterization

    Authors: Sheng Liu, Zhihui Zhu, Qing Qu, Chong You

    Abstract: Recently, over-parameterized deep networks, with increasingly more network parameters than training samples, have dominated the performances of modern machine learning. However, when the training data is corrupted, it has been well-known that over-parameterized networks tend to overfit and do not generalize. In this work, we propose a principled approach for robust training of over-parameterized d…

    Submitted 2 August, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: 25 pages, 4 figures and 6 tables. Code is available at https://github.com/shengliu66/SOP

  9. arXiv:2109.11154

    math.OC cs.LG stat.ML

    Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery

    Authors: Lijun Ding, Liwei Jiang, Yudong Chen, Qing Qu, Zhihui Zhu

    Abstract: We study the robust recovery of a low-rank matrix from sparsely and grossly corrupted Gaussian measurements, with no prior knowledge on the intrinsic rank. We consider the robust matrix factorization approach. We employ a robust $\ell_1$ loss function and deal with the challenge of the unknown rank by using an overspecified factored representation of the matrix variable. We then solve the associat…

    Submitted 26 October, 2021; v1 submitted 23 September, 2021; originally announced September 2021.

    Comments: 75 pages, 3 figures

  10. arXiv:2105.02375

    cs.LG cs.AI cs.IT math.OC stat.ML

    A Geometric Analysis of Neural Collapse with Unconstrained Features

    Authors: Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu

    Abstract: We provide the first global optimization landscape analysis of Neural Collapse -- an intriguing empirical phenomenon that arises in the last-layer classifiers and features of neural networks during the terminal phase of training. As recently reported by Papyan et al., this phenomenon implies that (i) the class means and the last-layer classifiers all collapse to the vertices of a Simplex Equi…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: 42 pages, 8 figures, 1 table; the first two authors contributed to this work equally

  11. arXiv:2007.06753

    cs.LG cs.CV cs.IT math.OC stat.ML

    From Symmetry to Geometry: Tractable Nonconvex Problems

    Authors: Yuqian Zhang, Qing Qu, John Wright

    Abstract: As science and engineering have become increasingly data-driven, the role of optimization has expanded to touch almost every stage of the data analysis pipeline, from signal and data acquisition to modeling and prediction. The optimization problems encountered in practice are often nonconvex. While challenges vary from problem to problem, one common source of nonconvexity is nonlinearity in the da…

    Submitted 8 July, 2022; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: review paper, 38 pages, 10 figures, revision: correction of typos, adding more discussion on recent advances on deep learning

  12. arXiv:2006.08857

    cs.LG cs.CV math.OC stat.ML

    Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

    Authors: Chong You, Zhihui Zhu, Qing Qu, Yi Ma

    Abstract: Recent advances have shown that implicit bias of gradient descent on over-parameterized models enables the recovery of low-rank matrices from linear measurements, even with no prior knowledge on the intrinsic rank. In contrast, for robust low-rank matrix recovery from grossly corrupted measurements, over-parameterization leads to overfitting without prior knowledge on both the intrinsic rank and s…

    Submitted 15 June, 2020; originally announced June 2020.

  13. arXiv:2001.06970

    cs.LG cs.IT eess.IV math.OC stat.ML

    Finding the Sparsest Vectors in a Subspace: Theory, Algorithms, and Applications

    Authors: Qing Qu, Zhihui Zhu, Xiao Li, Manolis C. Tsakiris, John Wright, René Vidal

    Abstract: The problem of finding the sparsest vector (direction) in a low dimensional subspace can be considered as a homogeneous variant of the sparse recovery problem, which finds applications in robust subspace recovery, dictionary learning, sparse blind deconvolution, and many other problems in signal processing and machine learning. However, in contrast to the classical sparse recovery problem, the mos…

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: QQ and ZZ contributed equally to the work. Invited review paper for IEEE Signal Processing Magazine Special Issue on non-convex optimization for signal processing and machine learning. This article contains 26 pages with 11 figures

  14. arXiv:1912.02427

    cs.LG cs.IT eess.SP math.OC stat.ML

    Analysis of the Optimization Landscapes for Overcomplete Representation Learning

    Authors: Qing Qu, Yuexiang Zhai, Xiao Li, Yuqian Zhang, Zhihui Zhu

    Abstract: We study nonconvex optimization landscapes for learning overcomplete representations, including learning (i) sparsely used overcomplete dictionaries and (ii) convolutional dictionaries, where these unsupervised learning problems find many applications in high-dimensional data analysis. Despite the empirical success of simple nonconvex algorithms, theoretical justifications of why these methods wor…

    Submitted 10 December, 2019; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: 68 pages, 5 figures

  15. arXiv:1908.10959

    eess.SP cs.LG eess.IV math.OC stat.ML

    Short-and-Sparse Deconvolution -- A Geometric Approach

    Authors: Yenson Lau, Qing Qu, Han-Wen Kuo, Pengcheng Zhou, Yuqian Zhang, John Wright

    Abstract: Short-and-sparse deconvolution (SaSD) is the problem of extracting localized, recurring motifs in signals with spatial or temporal structure. Variants of this problem arise in applications such as image deblurring, microscopy, neural spike sorting, and more. The problem is challenging in both theory and practice, as natural optimization formulations are nonconvex. Moreover, practical deconvolution…

    Submitted 1 October, 2019; v1 submitted 28 August, 2019; originally announced August 2019.

    Comments: YL and QQ contributed equally to this work; 30 figures, 45 pages; this version: added an experiment comparing with other methods, corrected typos, and added references

  16. arXiv:1908.10776

    eess.SP cs.LG eess.IV math.OC stat.ML

    A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution

    Authors: Qing Qu, Xiao Li, Zhihui Zhu

    Abstract: We study the multi-channel sparse blind deconvolution (MCS-BD) problem, whose task is to simultaneously recover a kernel $\mathbf a$ and multiple sparse inputs $\{\mathbf x_i\}_{i=1}^p$ from their circulant convolution $\mathbf y_i = \mathbf a \circledast \mathbf x_i $ ($i=1,\cdots,p$). We formulate the task as a nonconvex optimization problem over the sphere. Under mild statistical assumptions of…

    Submitted 29 February, 2020; v1 submitted 28 August, 2019; originally announced August 2019.

    Comments: 62 pages, 6 figures; short version accepted as a spotlight paper at NeurIPS'19 (https://papers.nips.cc/paper/8656-a-nonconvex-approach-for-exact-and-efficient-multichannel-sparse-blind-deconvolution); a long journal version is under revision at SIIMS

  17. arXiv:1903.01747

    cs.LG stat.ML

    Towards Understanding Chinese Checkers with Heuristics, Monte Carlo Tree Search, and Deep Reinforcement Learning

    Authors: Ziyu Liu, Meng Zhou, Weiqing Cao, Qiang Qu, Henry Wing Fung Yeung, Vera Yuk Ying Chung

    Abstract: The game of Chinese Checkers is a challenging traditional board game of perfect information that differs from other traditional games in two main aspects: first, unlike Chess, all checkers remain indefinitely in the game and hence the branching factor of the search tree does not decrease as the game progresses; second, unlike Go, there are also no upper bounds on the depth of the search tree since…

    Submitted 8 March, 2019; v1 submitted 5 March, 2019; originally announced March 2019.

  18. N-fold Superposition: Improving Neural Networks by Reducing the Noise in Feature Maps

    Authors: Yang Liu, Qiang Qu, Chao Gao

    Abstract: Considering the use of Fully Connected (FC) layer limits the performance of Convolutional Neural Networks (CNNs), this paper develops a method to improve the coupling between the convolution layer and the FC layer by reducing the noise in Feature Maps (FMs). Our approach is divided into three steps. Firstly, we separate all the FMs into n blocks equally. Then, the weighted summation of FMs at the…

    Submitted 3 May, 2018; v1 submitted 22 April, 2018; originally announced April 2018.

    Comments: 7 pages, 5 figures, submitted to ICALIP 2018

    Journal ref: 2018 International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, 2018, pp. 450-456

  19. arXiv:1712.00716

    stat.CO cs.IT math.NA math.OC stat.ML

    Convolutional Phase Retrieval via Gradient Descent

    Authors: Qing Qu, Yuqian Zhang, Yonina C. Eldar, John Wright

    Abstract: We study the convolutional phase retrieval problem, of recovering an unknown signal $\mathbf x \in \mathbb C^n $ from $m$ measurements consisting of the magnitude of its cyclic convolution with a given kernel $\mathbf a \in \mathbb C^m $. This model is motivated by applications such as channel estimation, optics, and underwater acoustic communication, where the signal of interest is acted on by a…

    Submitted 5 October, 2019; v1 submitted 3 December, 2017; originally announced December 2017.

    Comments: 64 pages, 9 figures, appeared in NeurIPS 2017. Accepted at IEEE Transactions on Information Theory. This is the final (minor) update: fixed typos and grammar issues

  20. arXiv:1602.06664

    cs.IT math.OC stat.ML

    A Geometric Analysis of Phase Retrieval

    Authors: Ju Sun, Qing Qu, John Wright

    Abstract: Can we recover a complex signal from its Fourier magnitudes? More generally, given a set of $m$ measurements, $y_k = |\mathbf a_k^* \mathbf x|$ for $k = 1, \dots, m$, is it possible to recover $\mathbf x \in \mathbb{C}^n$ (i.e., length-$n$ complex vector)? This **generalized phase retrieval** (GPR) problem is a fundamental task in various disciplines, and has been the subject of much recent invest…

    Submitted 1 January, 2017; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: 61 pages, 5 figures. A short version can be found here http://sunju.org/docs/PR_G4_16.pdf . Revised according to reviewers' feedback

    Journal ref: Foundations of Computational Mathematics, 18(5):1131--1198, 2018

  21. arXiv:1511.04777

    cs.IT cs.CV math.OC stat.ML

    Complete Dictionary Recovery over the Sphere II: Recovery by Riemannian Trust-region Method

    Authors: Ju Sun, Qing Qu, John Wright

    Abstract: We consider the problem of recovering a complete (i.e., square and invertible) matrix $\mathbf A_0$, from $\mathbf Y \in \mathbb{R}^{n \times p}$ with $\mathbf Y = \mathbf A_0 \mathbf X_0$, provided $\mathbf X_0$ is sufficiently sparse. This recovery problem is central to theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals and fin…

    Submitted 1 September, 2016; v1 submitted 15 November, 2015; originally announced November 2015.

    Comments: The second of two papers based on the report arXiv:1504.06785. Accepted by IEEE Transactions on Information Theory; revised according to the reviewers' comments

    Journal ref: IEEE Trans. Information Theory, 63(2): 885 - 914 (2017)

  22. arXiv:1511.03607

    cs.IT cs.CV math.OC stat.ML

    Complete Dictionary Recovery over the Sphere I: Overview and the Geometric Picture

    Authors: Ju Sun, Qing Qu, John Wright

    Abstract: We consider the problem of recovering a complete (i.e., square and invertible) matrix $\mathbf A_0$, from $\mathbf Y \in \mathbb{R}^{n \times p}$ with $\mathbf Y = \mathbf A_0 \mathbf X_0$, provided $\mathbf X_0$ is sufficiently sparse. This recovery problem is central to theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals and fin…

    Submitted 1 September, 2016; v1 submitted 11 November, 2015; originally announced November 2015.

    Comments: Accepted by IEEE Transactions on Information Theory; revised according to the reviewers' comments

    Journal ref: IEEE Trans. Information Theory, 63(2): 853 - 884 (2017)

  23. arXiv:1510.06096

    math.OC cs.IT stat.ML

    When Are Nonconvex Problems Not Scary?

    Authors: Ju Sun, Qing Qu, John Wright

    Abstract: In this note, we focus on smooth nonconvex optimization problems that obey: (1) all local minimizers are also global; and (2) around any saddle point or local maximizer, the objective has a negative directional curvature. Concrete applications such as dictionary learning, generalized phase retrieval, and orthogonal tensor decomposition are known to induce such structures. We describe a second-orde…

    Submitted 22 April, 2016; v1 submitted 20 October, 2015; originally announced October 2015.

    Comments: 6 pages, 3 figures. New examples on phase synchronization and community detection added; emphasis on all local minimizers being global added; exposition is polished. This is a concise expository article that avoids much technical rigor. We will make a separate submission with full technical details in future

  24. arXiv:1504.06785

    cs.IT cs.CV cs.LG math.OC stat.ML

    Complete Dictionary Recovery over the Sphere

    Authors: Ju Sun, Qing Qu, John Wright

    Abstract: We consider the problem of recovering a complete (i.e., square and invertible) matrix $\mathbf A_0$, from $\mathbf Y \in \mathbb R^{n \times p}$ with $\mathbf Y = \mathbf A_0 \mathbf X_0$, provided $\mathbf X_0$ is sufficiently sparse. This recovery problem is central to the theoretical understanding of dictionary learning, which seeks a sparse representation for a collection of input signals, and…

    Submitted 17 November, 2015; v1 submitted 26 April, 2015; originally announced April 2015.

    Comments: 104 pages, 5 figures. Due to publication length constraints, this long paper was subsequently divided into two papers (arXiv:1511.03607 and arXiv:1511.04777). Further updates will be made only to the two papers

    MSC Class: 68P30; 58C05; 94A12; 94A08; 68T05; 90C26; 90C48; 90C55

  25. arXiv:1412.4659

    cs.IT cs.CV cs.LG math.OC stat.ML

    Finding a sparse vector in a subspace: Linear sparsity using alternating directions

    Authors: Qing Qu, Ju Sun, John Wright

    Abstract: Is it possible to find the sparsest vector (direction) in a generic subspace $\mathcal{S} \subseteq \mathbb{R}^p$ with $\mathrm{dim}(\mathcal{S})= n < p$? This problem can be considered a homogeneous variant of the sparse recovery problem, and finds connections to sparse dictionary learning, sparse PCA, and many other problems in signal processing and machine learning. In this paper, we focus on a…

    Submitted 19 July, 2016; v1 submitted 15 December, 2014; originally announced December 2014.

    Comments: Accepted by IEEE Trans. Information Theory. The paper has been revised according to the reviewers' comments. The proofs have been streamlined

    Journal ref: IEEE Transactions on Information Theory, 62(10):5855 - 5880, 2016

  26. arXiv:1401.3818

    cs.CV cs.LG stat.ML

    Structured Priors for Sparse-Representation-Based Hyperspectral Image Classification

    Authors: Xiaoxia Sun, Qing Qu, Nasser M. Nasrabadi, Trac D. Tran

    Abstract: Pixel-wise classification, where each pixel is assigned to a predefined class, is one of the most important procedures in hyperspectral image (HSI) analysis. By representing a test pixel as a linear combination of a small subset of labeled pixels, a sparse representation classifier (SRC) gives rather plausible results compared with that of traditional classifiers such as the support vector machine…

    Submitted 15 January, 2014; originally announced January 2014.

    Comments: IEEE Geoscience and Remote Sensing Letters