Showing 1–5 of 5 results for author: Yaras, C

  1. arXiv:2406.04112

    cs.LG cs.AI eess.SP stat.ML

    Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

    Authors: Can Yaras, Peng Wang, Laura Balzano, Qing Qu

    Abstract: While overparameterization in machine learning models offers great benefits in terms of optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the…

    Submitted 9 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML'24 (Oral)

  2. arXiv:2311.02960

    cs.LG cs.CV math.OC

    Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

    Authors: Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

    Abstract: Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an open question how deep networks perform hierarchical feature learning across layers. In this work, we attempt to unveil this mystery by investigating the structures of intermediate features. Motivated by our empirical findings that linear layers mimic…

    Submitted 9 January, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: 61 pages, 14 figures

  3. arXiv:2306.01154

    cs.LG

    The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

    Authors: Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

    Abstract: Over the past few years, an extensively studied phenomenon in training deep networks is the implicit bias of gradient descent towards parsimonious solutions. In this work, we investigate this phenomenon by narrowing our focus to deep linear networks. Through our analysis, we reveal a surprising "law of parsimony" in the learning dynamics when the data possesses low-dimensional structures. Specific…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: The first two authors contributed to this work equally; 32 pages, 12 figures

  4. arXiv:2209.09211

    cs.LG cs.CV cs.IT eess.SP stat.ML

    Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

    Authors: Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

    Abstract: When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon. More specifically, for the output features of the penultimate layer, for each class the within-class features converge to their means, and the means of different classes exhibit a certain tight frame structure, which is also…

    Submitted 7 March, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: The first two authors contributed to this work equally; 38 pages, 13 figures. Accepted at NeurIPS'22

  5. arXiv:2104.14032

    cs.CV

    Randomized Histogram Matching: A Simple Augmentation for Unsupervised Domain Adaptation in Overhead Imagery

    Authors: Can Yaras, Kaleb Kassaw, Bohao Huang, Kyle Bradbury, Jordan M. Malof

    Abstract: Modern deep neural networks (DNNs) are highly accurate on many recognition tasks for overhead (e.g., satellite) imagery. However, visual domain shifts (e.g., statistical changes due to geography, sensor, or atmospheric conditions) remain a challenge, causing the accuracy of DNNs to degrade substantially and unpredictably when testing on new sets of imagery. In this work, we model domain shifts cau…

    Submitted 11 August, 2023; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: Includes a main paper (10 pages). This paper is currently undergoing peer review.