
Showing 1–21 of 21 results for author: Pandit, P

Searching in archive cs.
  1. arXiv:2312.03311  [pdf, other]

    stat.ML cs.LG

    On the Nystrom Approximation for Preconditioning in Kernel Machines

    Authors: Amirhesam Abedsoltan, Parthe Pandit, Luis Rademacher, Mikhail Belkin

    Abstract: Kernel methods are a popular class of nonlinear predictive models in machine learning. Scalable algorithms for learning kernel models need to be iterative in nature, but convergence can be slow due to poor conditioning. Spectral preconditioning is an important tool to speed up the convergence of such iterative algorithms for training kernel models. However, computing and storing a spectral precondi…

    Submitted 24 January, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

  2. arXiv:2309.00570  [pdf, other]

    stat.ML cs.CV cs.LG

    Mechanism of feature learning in convolutional neural networks

    Authors: Daniel Beaglehole, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

    Abstract: Understanding the mechanism of how convolutional neural networks learn features from image data is a fundamental problem in machine learning and computer vision. In this work, we identify such a mechanism. We posit the Convolutional Neural Feature Ansatz, which states that covariances of filters in any convolutional layer are proportional to the average gradient outer product (AGOP) taken with res…

    Submitted 1 September, 2023; originally announced September 2023.

  3. arXiv:2305.08277  [pdf, other]

    cs.LG stat.ML

    Local Convergence of Gradient Descent-Ascent for Training Generative Adversarial Networks

    Authors: Evan Becker, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

    Abstract: Generative Adversarial Networks (GANs) are a popular formulation to train generative models for complex high dimensional data. The standard method for training GANs involves a gradient descent-ascent (GDA) procedure on a minimax optimization problem. This procedure is hard to analyze in general due to the nonlinear nature of the dynamics. We study the local dynamics of GDA for training a GAN with…

    Submitted 29 May, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

  4. arXiv:2302.02605  [pdf, other]

    cs.LG stat.ML

    Toward Large Kernel Models

    Authors: Amirhesam Abedsoltan, Mikhail Belkin, Parthe Pandit

    Abstract: Recent studies indicate that kernel machines can often perform similarly or better than deep neural networks (DNNs) on small datasets. The interest in kernel machines has been additionally bolstered by the discovery of their equivalence to wide neural networks in certain regimes. However, a key feature of DNNs is their ability to scale the model size and training data size independently, whereas i…

    Submitted 19 June, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Code is available at github.com/EigenPro/EigenPro3

  5. arXiv:2212.13881  [pdf, other]

    cs.LG cs.AI stat.ML

    Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features

    Authors: Adityanarayanan Radhakrishnan, Daniel Beaglehole, Parthe Pandit, Mikhail Belkin

    Abstract: In recent years neural networks have achieved impressive results on many technological and scientific tasks. Yet, the mechanism through which these models automatically select features, or patterns in data, for prediction remains unclear. Identifying such a mechanism is key to advancing performance and interpretability of neural networks and promoting reliable adoption of these models in scientifi…

    Submitted 9 May, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

  6. arXiv:2208.09938  [pdf, other]

    cs.LG

    Instability and Local Minima in GAN Training with Kernel Discriminators

    Authors: Evan Becker, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

    Abstract: Generative Adversarial Networks (GANs) are a widely-used tool for generative modeling of complex data. Despite their empirical success, the training of GANs is not fully understood due to the min-max optimization of the generator and discriminator. This paper analyzes these joint dynamics when the true samples, as well as the generated samples, are discrete, finite sets, and the discriminator is k…

    Submitted 21 August, 2022; originally announced August 2022.

  7. arXiv:2207.06569  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Benign, Tempered, or Catastrophic: A Taxonomy of Overfitting

    Authors: Neil Mallinar, James B. Simon, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran

    Abstract: The practical success of overparameterized neural networks has motivated the recent scientific study of interpolating methods, which perfectly fit their training data. Certain interpolating methods, including neural networks, can fit noisy training data without catastrophically bad test performance, in defiance of standard intuitions from statistical learning theory. Aiming to explain this, a body…

    Submitted 20 October, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: NM and JS co-first authors

  8. arXiv:2206.15058  [pdf, other]

    cs.LG stat.ML

    A note on Linear Bottleneck networks and their Transition to Multilinearity

    Authors: Libin Zhu, Parthe Pandit, Mikhail Belkin

    Abstract: Randomly initialized wide neural networks transition to linear functions of weights as the width grows, in a ball of radius $O(1)$ around initialization. A necessary condition for this result is that all layers of the network are wide enough, i.e., all widths tend to infinity. However, the transition to linearity breaks down when this infinite width assumption is violated. In this work we show tha…

    Submitted 30 June, 2022; originally announced June 2022.

  9. arXiv:2205.13525  [pdf, other]

    cs.LG

    On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions

    Authors: Daniel Beaglehole, Mikhail Belkin, Parthe Pandit

    Abstract: "Benign overfitting", the ability of certain algorithms to interpolate noisy training data and yet perform well out-of-sample, has been a topic of considerable recent interest. We show, using a fixed design setup, that an important class of predictors, kernel machines with translation-invariant kernels, does not exhibit benign overfitting in fixed dimensions. In particular, the estimated predict…

    Submitted 12 April, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

  10. arXiv:2201.08082  [pdf, other]

    stat.ML cs.LG

    Kernel Methods and Multi-layer Perceptrons Learn Linear Models in High Dimensions

    Authors: Mojtaba Sahraee-Ardakan, Melikasadat Emami, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

    Abstract: Empirical observation of high dimensional phenomena, such as the double descent behaviour, has attracted a lot of interest in understanding classical techniques such as kernel methods, and their implications to explain generalization properties of neural networks. Many recent works analyze such models in a certain high-dimensional regime where the covariates are independent and the number of sampl…

    Submitted 20 January, 2022; originally announced January 2022.

  11. arXiv:2108.03760  [pdf]

    cs.AI

    Symptom based Hierarchical Classification of Diabetes and Thyroid disorders using Fuzzy Cognitive Maps

    Authors: Anand M. Shukla, Pooja D. Pandit, Vasudev M. Purandare, Anuradha Srinivasaraghavan

    Abstract: Fuzzy Cognitive Maps (FCMs) are a soft computing technique that follows an approach similar to human reasoning and human decision-making process, making them a valuable modeling and simulation methodology. Medical Decision Systems are complex systems consisting of many factors that may be complementary, contradictory, and competitive; these factors influence each other and determine the overall diag…

    Submitted 8 August, 2021; originally announced August 2021.

  12. arXiv:2101.07833  [pdf, ps, other]

    cs.LG cs.NE eess.SY stat.ML

    Implicit Bias of Linear RNNs

    Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

    Abstract: Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, precise reasoning for this behavior is still unknown. This paper provides a rigorous explanation of this property in the special case of linear RNNs. Although this work is limited to linear RNNs, even these systems have traditional…

    Submitted 19 January, 2021; originally announced January 2021.

    Comments: 30 pages, 4 figures

  13. arXiv:2005.05053  [pdf, other]

    q-bio.NC cs.LG cs.NE eess.SP stat.ML

    Low-Rank Nonlinear Decoding of $μ$-ECoG from the Primary Auditory Cortex

    Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Alyson K. Fletcher, Sundeep Rangan, Michael Trumpis, Brinnae Bent, Chia-Han Chiang, Jonathan Viventi

    Abstract: This paper considers the problem of neural decoding from parallel neural measurements systems such as micro-electrocorticography ($μ$-ECoG). In systems with large numbers of array elements at very high sampling rates, the dimension of the raw measurement data may be large. Learning neural decoders for this high-dimensional data can be challenging, particularly when the number of training samples i…

    Submitted 6 May, 2020; originally announced May 2020.

    Comments: 4 pages, 3 figures

  14. arXiv:2005.00180  [pdf, other]

    cs.LG stat.ML

    Generalization Error of Generalized Linear Models in High Dimensions

    Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson K. Fletcher

    Abstract: At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our understanding of their generalization capabilities is incomplete. This task is made harder by the non-convexity of the underlying learning problems. We provide a general…

    Submitted 30 April, 2020; originally announced May 2020.

    Comments: 20 pages, 4 figures

  15. arXiv:2001.09396  [pdf, other]

    cs.LG cs.IT cs.NE eess.SP stat.ML

    Inference in Multi-Layer Networks with Matrix-Valued Unknowns

    Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter, Alyson K. Fletcher

    Abstract: We consider the problem of inferring the input and hidden variables of a stochastic multi-layer neural network from an observation of the output. The hidden variables in each layer are represented as matrices. This problem applies to signal recovery via deep generative prior models, multi-task and mixed regression and learning certain classes of two-layer neural networks. A unified approximation a…

    Submitted 25 January, 2020; originally announced January 2020.

    Comments: 3 figures, 6 pages (two-column) + Appendix. arXiv admin note: text overlap with arXiv:1911.03409

  16. arXiv:1911.03409  [pdf, other]

    cs.LG cs.IT cs.NE eess.SP stat.ML

    Inference with Deep Generative Priors in High Dimensions

    Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter, Alyson K. Fletcher

    Abstract: Deep generative priors offer powerful models for complex-structured data, such as images, audio, and text. Using these priors in inverse problems typically requires estimating the input and/or hidden signals in a multi-layer deep neural network from observation of its output. While these approaches have been successful in practice, rigorous performance analysis is complicated by the non-convex nat…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: 50 pages, double-spaced

  17. arXiv:1903.09631  [pdf, other]

    math.ST cs.LG eess.SP stat.ML

    High-Dimensional Bernoulli Autoregressive Process with Long-Range Dependence

    Authors: Parthe Pandit, Mojtaba Sahraee-Ardakan, Arash A. Amini, Sundeep Rangan, Alyson K. Fletcher

    Abstract: We consider the problem of estimating the parameters of a multivariate Bernoulli process with auto-regressive feedback in the high-dimensional setting where the number of samples available is much less than the number of parameters. This problem arises in learning interconnections of networks of dynamical systems with spiking or binary-valued data. We allow the process to depend on its past up to…

    Submitted 19 March, 2019; originally announced March 2019.

    Comments: To appear at AISTATS 2019 titled "Sparse Multivariate Bernoulli Processes in High Dimensions"

    Journal ref: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS) 2019, Naha, Okinawa, Japan. PMLR: Volume 89

  18. arXiv:1903.01293  [pdf, other]

    cs.IT stat.ML

    Asymptotics of MAP Inference in Deep Networks

    Authors: Parthe Pandit, Mojtaba Sahraee, Sundeep Rangan, Alyson K. Fletcher

    Abstract: Deep generative priors are a powerful tool for reconstruction problems with complex data such as images and text. Inverse problems using such models require solving an inference problem of estimating the input and hidden units of the multi-layer network from its output. Maximum a posteriori (MAP) estimation is a widely-used inference method as it is straightforward to implement, and has been successfu…

    Submitted 1 March, 2019; originally announced March 2019.

    Comments: 11 pages. arXiv admin note: text overlap with arXiv:1706.06549

  19. arXiv:1608.06627  [pdf]

    physics.med-ph cs.CV cs.NE

    Artificial Neural Networks for Detection of Malaria in RBCs

    Authors: Purnima Pandit, A. Anand

    Abstract: Malaria is one of the most common diseases caused by mosquitoes and is a great public health problem worldwide. Currently, for malaria diagnosis the standard technique is microscopic examination of a stained blood film. We propose use of Artificial Neural Networks (ANN) for the diagnosis of the disease in the red blood cell. For this purpose features / parameters are computed from the data obtaine…

    Submitted 23 August, 2016; originally announced August 2016.

    MSC Class: 62M45

  20. Refinement of the Equilibrium of Public Goods Games over Networks: Efficiency and Effort of Specialized Equilibria

    Authors: Parthe Pandit, Ankur A. Kulkarni

    Abstract: Recently Bramoulle and Kranton presented a model for the provision of public goods over a network and showed the existence of a class of Nash equilibria called specialized equilibria wherein some agents exert maximum effort while other agents free ride. We examine the efficiency, effort and cost of specialized equilibria in comparison to other equilibria. Our main results show that the welfare of…

    Submitted 23 January, 2022; v1 submitted 7 July, 2016; originally announced July 2016.

    MSC Class: 91A43; 05C57; 91D30; 90C35

    Journal ref: Journal of Mathematical Economics, Available online 16 April 2018

  21. arXiv:1603.05075  [pdf, ps, other]

    cs.DM cs.CC math.CO math.OC

    A linear complementarity based characterization of the weighted independence number and the independent domination number in graphs

    Authors: Parthe Pandit, Ankur A. Kulkarni

    Abstract: The linear complementarity problem is a continuous optimization problem that generalizes convex quadratic programming, Nash equilibria of bimatrix games and several such problems. This paper presents a continuous optimization formulation for the weighted independence number of a graph by characterizing it as the maximum weighted $\ell_1$ norm over the solution set of a linear complementarity probl…

    Submitted 16 March, 2016; originally announced March 2016.

    Comments: 16 pages

    MSC Class: 05C69; 68R10; 90C33; 90C27; 90C26

    Journal ref: Discrete Applied Mathematics, Volume 244, 31 July 2018, Pages 155-169