Showing 1–7 of 7 results for author: Vural, N M

  1. arXiv:2406.08658  [pdf, ps, other]

    stat.ML cs.LG

    Pruning is Optimal for Learning Sparse Features in High-Dimensions

    Authors: Nuri Mert Vural, Murat A. Erdogdu

    Abstract: While it is commonly observed in practice that pruning networks to a certain level of sparsity can improve the quality of the features, a theoretical explanation of this phenomenon remains elusive. In this work, we investigate this by demonstrating that a broad class of statistical models can be optimally learned using pruned neural networks trained with gradient descent, in high-dimensions. We…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2024

  2. arXiv:2202.11632  [pdf, other]

    stat.ML cs.LG math.OC

    Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance

    Authors: Nuri Mert Vural, Lu Yu, Krishnakumar Balasubramanian, Stanislav Volgushev, Murat A. Erdogdu

    Abstract: We study stochastic convex optimization under infinite noise variance. Specifically, when the stochastic gradient is unbiased and has uniformly bounded $(1+κ)$-th moment, for some $κ\in (0,1]$, we quantify the convergence rate of the Stochastic Mirror Descent algorithm with a particular class of uniformly convex mirror maps, in terms of the number of iterations, dimensionality and related geometri…

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: 31 pages, 1 figure

  3. arXiv:2005.08948  [pdf, other]

    cs.LG stat.ML

    Achieving Online Regression Performance of LSTMs with Simple RNNs

    Authors: N. Mert Vural, Fatih Ilhan, Selim F. Yilmaz, Salih Ergüt, Suleyman S. Kozat

    Abstract: Recurrent Neural Networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. As an RNN model, Long-Short-Term-Memory Networks (LSTMs) are commonly preferred in practice, as these networks are capable of learning long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, traini…

    Submitted 31 May, 2021; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2003.03601

  4. arXiv:2003.03601   

    cs.LG stat.ML

    RNN-based Online Learning: An Efficient First-Order Optimization Algorithm with a Convergence Guarantee

    Authors: N. Mert Vural, Selim F. Yilmaz, Fatih Ilhan, Suleyman S. Kozat

    Abstract: We investigate online nonlinear regression with continually running recurrent neural networks (RNNs), i.e., RNN-based online learning. For RNN-based online learning, we introduce an efficient first-order training algorithm that is theoretically guaranteed to converge to the optimum network parameters. Our algorithm is truly online such that it does not make any assumption on the learning envi…

    Submitted 31 May, 2021; v1 submitted 7 March, 2020; originally announced March 2020.

    Comments: This paper was an early draft of the presented results. We have written and published another paper (arXiv:2005.08948) where we have improved the material in this paper. The published paper covers most of the material presented in this paper as well. Therefore, we have removed this paper from arXiv and kindly refer interested readers to arXiv:2005.08948.

  5. arXiv:1911.12258   

    cs.LG eess.SP stat.ML

    Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning

    Authors: Nuri Mert Vural, Fatih Ilhan, Suleyman S. Kozat

    Abstract: We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long-short term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations -- introduced due t…

    Submitted 31 May, 2021; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: This paper was an early draft of the presented results. We have written and published another paper (arXiv:1911.12258) where we have improved on the material in this paper. The published paper covers most of the material presented in this paper as well. Therefore, we have removed this paper from arXiv and refer interested readers to arXiv:1911.12258.

  6. arXiv:1911.11122  [pdf, other]

    cs.LG stat.ML

    Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays

    Authors: N. Mert Vural, Hakan Gokcesu, Kaan Gokcesu, Suleyman S. Kozat

    Abstract: We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal regret bounds. To construct our algorithm, we introduce a new expert advice algorithm for the multiple-play setting. By using our expert advice algorithm, we a…

    Submitted 25 November, 2019; originally announced November 2019.

  7. arXiv:1910.09857  [pdf, other]

    cs.LG eess.SP stat.ML

    An Efficient and Effective Second-Order Training Algorithm for LSTM-based Adaptive Learning

    Authors: N. Mert Vural, Salih Ergüt, Suleyman S. Kozat

    Abstract: We study adaptive (or online) nonlinear regression with Long-Short-Term-Memory (LSTM) based networks, i.e., LSTM-based adaptive learning. In this context, we introduce an efficient Extended Kalman filter (EKF) based second-order training algorithm. Our algorithm is truly online, i.e., it does not assume any underlying data generating process and future information, except that the target sequence…

    Submitted 31 May, 2021; v1 submitted 22 October, 2019; originally announced October 2019.