Showing 1–8 of 8 results for author: Devarakonda, A

Searching in archive cs.
  1. arXiv:2406.18001  [pdf, other]

    cs.DC stat.ML

    Scalable Dual Coordinate Descent for Kernel Methods

    Authors: Zishan Shao, Aditya Devarakonda

    Abstract: Dual Coordinate Descent (DCD) and Block Dual Coordinate Descent (BDCD) are important iterative methods for solving convex optimization problems. In this work, we develop scalable DCD and BDCD methods for the kernel support vector machines (K-SVM) and kernel ridge regression (K-RR) problems. On distributed-memory parallel machines the scalability of these methods is limited by the need to communica…

    Submitted 25 June, 2024; originally announced June 2024.

    MSC Class: 65Y05 ACM Class: D.1.3; G.4; F.2.1

  2. arXiv:2307.16652  [pdf, other]

    cs.DC cs.LG stat.ML

    Sequential and Shared-Memory Parallel Algorithms for Partitioned Local Depths

    Authors: Aditya Devarakonda, Grey Ballard

    Abstract: In this work, we design, analyze, and optimize sequential and shared-memory parallel algorithms for partitioned local depths (PaLD). Given a set of data points and pairwise distances, PaLD is a method for identifying strength of pairwise relationships based on relative distances, enabling the identification of strong ties within dense and sparse communities even if their sizes and within-community…

    Submitted 31 July, 2023; originally announced July 2023.

    MSC Class: 68W10 ACM Class: D.1.3

  3. arXiv:2011.08281  [pdf, other]

    cs.LG cs.DC

    Avoiding Communication in Logistic Regression

    Authors: Aditya Devarakonda, James Demmel

    Abstract: Stochastic gradient descent (SGD) is one of the most widely used optimization methods for solving various machine learning problems. SGD solves an optimization problem by iteratively sampling a few data points from the input data, computing gradients for the selected data points, and updating the solution. However, in a parallel setting, SGD requires interprocess communication at every iteration.…

    Submitted 16 November, 2020; originally announced November 2020.

  4. arXiv:1712.06047  [pdf, other]

    cs.DC cs.LG math.OC stat.ML

    Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization

    Authors: Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney

    Abstract: Parallel computing has played an important role in speeding up convex optimization methods for big data analytics and large-scale machine learning (ML). However, the scalability of these optimization methods is inhibited by the cost of communicating and synchronizing processors in a parallel setting. Iterative ML methods are particularly sensitive to communication cost since they often require com…

    Submitted 16 December, 2017; originally announced December 2017.

    MSC Class: 68W10; 90C25 ACM Class: G.1.6

  5. arXiv:1712.02029  [pdf, other]

    cs.LG cs.CV cs.DC stat.ML

    AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks

    Authors: Aditya Devarakonda, Maxim Naumov, Michael Garland

    Abstract: Training deep neural networks with Stochastic Gradient Descent, or its variants, requires careful choice of both learning rate and batch size. While smaller batch sizes generally converge in fewer training epochs, larger batch sizes offer more parallelism and hence better computational efficiency. We have developed a new training approach that, rather than statically choosing a single batch size f…

    Submitted 13 February, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

    Comments: 14 pages

    MSC Class: 68T05 ACM Class: I.2.6; I.5.0

  6. arXiv:1710.08883  [pdf, other]

    cs.DC cs.LG math.NA math.OC

    Avoiding Communication in Proximal Methods for Convex Optimization Problems

    Authors: Saeed Soori, Aditya Devarakonda, James Demmel, Mert Gurbuzbalaban, Maryam Mehri Dehnavi

    Abstract: The fast iterative soft thresholding algorithm (FISTA) is used to solve convex regularized optimization problems in machine learning. Distributed implementations of the algorithm have become popular since they enable the analysis of large datasets. However, existing formulations of FISTA communicate data at every iteration which reduces its performance on modern distributed architectures. The comm…

    Submitted 24 October, 2017; originally announced October 2017.

  7. arXiv:1612.04003  [pdf, other]

    cs.DC

    Avoiding communication in primal and dual block coordinate descent methods

    Authors: Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney

    Abstract: Primal and dual block coordinate descent methods are iterative methods for solving regularized and unregularized optimization problems. Distributed-memory parallel implementations of these methods have become popular in analyzing large machine learning datasets. However, existing implementations communicate at every iteration which, on modern data center and supercomputing architectures, often dom…

    Submitted 1 May, 2017; v1 submitted 12 December, 2016; originally announced December 2016.

    MSC Class: 68W10; 65F10 ACM Class: G.1.0; G.1.3; G.1.6

  8. arXiv:1607.01335  [pdf, other]

    cs.DC

    Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

    Authors: Alex Gittens, Aditya Devarakonda, Evan Racah, Michael Ringenburg, Lisa Gerhardt, Jey Kottalam, Jialin Liu, Kristyn Maschhoff, Shane Canon, Jatin Chhugani, Pramod Sharma, Jiyan Yang, James Demmel, Jim Harrell, Venkat Krishnamurthy, Michael W. Mahoney, Prabhat

    Abstract: We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity…

    Submitted 20 September, 2016; v1 submitted 5 July, 2016; originally announced July 2016.

    ACM Class: G.1.3; C.2.4