Showing 1–29 of 29 results for author: Tatikonda, S

Searching in archive cs.
  1. arXiv:2203.08065  [pdf, other]

    cs.LG cs.AI

    Surrogate Gap Minimization Improves Sharpness-Aware Training

    Authors: Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, Ting Liu

    Abstract: The recently proposed Sharpness-Aware Minimization (SAM) improves generalization by minimizing a \textit{perturbed loss} defined as the maximum loss within a neighborhood in the parameter space. However, we show that both sharp and flat minima can have a low perturbed loss, implying that SAM does not always prefer flat minima. Instead, we define a \textit{surrogate gap}, a measure equivalent to th…

    Submitted 19 March, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: Paper accepted by ICLR22, https://openreview.net/forum?id=edONMAnhLu-

  2. arXiv:2110.05454  [pdf, other]

    cs.LG math.OC

    Momentum Centering and Asynchronous Update for Adaptive Gradient Methods

    Authors: Juntang Zhuang, Yifan Ding, Tommy Tang, Nicha Dvornek, Sekhar Tatikonda, James S. Duncan

    Abstract: We propose ACProp (Asynchronous-centering-Prop), an adaptive optimizer which combines centering of second momentum and asynchronous update (e.g. for $t$-th update, denominator uses information up to step $t-1$, while numerator uses gradient at $t$-th step). ACProp has both strong theoretical properties and empirical performance. With the example by Reddi et al. (2018), we show that asynchronous op…

    Submitted 1 December, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

  3. arXiv:2106.07696  [pdf, other]

    cs.CV cs.AI

    Face Age Progression With Attribute Manipulation

    Authors: Sinzith Tatikonda, Athira Nambiar, Anurag Mittal

    Abstract: Face is one of the predominant means of person recognition. In the process of ageing, human face is prone to many factors such as time, attributes, weather and other subject specific variations. The impact of these factors was not well studied in the literature of face aging. In this paper, we propose a novel holistic model in this regard viz., ``Face Age progression With Attribute Manipulation (…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: -

  4. arXiv:2102.11013  [pdf, other]

    q-bio.NC cs.LG

    Multiple-shooting adjoint method for whole-brain dynamic causal modeling

    Authors: Juntang Zhuang, Nicha Dvornek, Sekhar Tatikonda, Xenophon Papademetris, Pamela Ventola, James Duncan

    Abstract: Dynamic causal modeling (DCM) is a Bayesian framework to infer directed connections between compartments, and has been used to describe the interactions between underlying neural populations based on functional neuroimaging data. DCM is typically analyzed with the expectation-maximization (EM) algorithm. However, because the inversion of a large-scale continuous system is difficult when noisy obse…

    Submitted 14 February, 2021; originally announced February 2021.

    Comments: 27th International Conference on Information Processing in Medical Imaging

  5. arXiv:2102.04668  [pdf, other]

    cs.LG

    MALI: A memory efficient and reverse accurate integrator for Neural ODEs

    Authors: Juntang Zhuang, Nicha C. Dvornek, Sekhar Tatikonda, James S. Duncan

    Abstract: Neural ordinary differential equations (Neural ODEs) are a new family of deep-learning models with continuous depth. However, the numerical estimation of the gradient in the continuous case is not well solved: existing implementations of the adjoint method suffer from inaccuracy in reverse-time trajectory, while the naive method and the adaptive checkpoint adjoint method (ACA) have a memory cost t…

    Submitted 3 March, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: https://openreview.net/forum?id=blfSjHeFM_e

    Journal ref: International Conference on Learning Representations, ICLR 2021

  6. arXiv:2010.07468  [pdf, other]

    cs.LG cs.CV stat.ML

    AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients

    Authors: Juntang Zhuang, Tommy Tang, Yifan Ding, Sekhar Tatikonda, Nicha Dvornek, Xenophon Papademetris, James S. Duncan

    Abstract: Most popular optimizers for deep learning can be broadly categorized as adaptive methods (e.g. Adam) and accelerated schemes (e.g. stochastic gradient descent (SGD) with momentum). For many models such as convolutional neural networks (CNNs), adaptive methods typically converge faster but generalize worse compared to SGD; for complex settings such as generative adversarial networks (GANs), adaptiv…

    Submitted 20 December, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

    Journal ref: NeurIPS 2020

  7. arXiv:2006.02493  [pdf]

    stat.ML cs.LG

    Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE

    Authors: Juntang Zhuang, Nicha Dvornek, Xiaoxiao Li, Sekhar Tatikonda, Xenophon Papademetris, James Duncan

    Abstract: Neural ordinary differential equations (NODEs) have recently attracted increasing attention; however, their empirical performance on benchmark tasks (e.g. image classification) is significantly inferior to that of discrete-layer models. We demonstrate that an explanation for their poorer performance is the inaccuracy of existing gradient estimation methods: the adjoint method has numerical errors in reverse-m…

    Submitted 3 June, 2020; originally announced June 2020.

    Journal ref: https://proceedings.icml.cc/static/paper_files/icml/2020/917-Paper.pdf

  8. Sparse Regression Codes

    Authors: Ramji Venkataramanan, Sekhar Tatikonda, Andrew Barron

    Abstract: Developing computationally-efficient codes that approach the Shannon-theoretic limits for communication and compression has long been one of the major goals of information and coding theory. There have been significant advances towards this goal in the last couple of decades, with the emergence of turbo codes, sparse-graph codes, and polar codes. These codes are designed primarily for discrete-alp…

    Submitted 2 November, 2019; originally announced November 2019.

    Comments: Published in Foundations and Trends in Communications and Information Theory, 2019

    Journal ref: Foundations and Trends in Communications and Information Theory, vol. 15, no. 1-2, pp. 1-195, 2019

  9. arXiv:1808.09889  [pdf, other]

    cs.CL cs.LG stat.ML

    Zero-shot Transfer Learning for Semantic Parsing

    Authors: Javid Dadashkarimi, Alexander Fabbri, Sekhar Tatikonda, Dragomir R. Radev

    Abstract: While neural networks have shown impressive performance on large datasets, applying these models to tasks where little data is available remains a challenging problem. In this paper we propose to use feature transfer in a zero-shot experimental setting on the task of semantic parsing. We first introduce a new method for learning the shared space between multiple domains based on the prediction…

    Submitted 27 August, 2018; originally announced August 2018.

  10. arXiv:1807.07333  [pdf, other]

    cs.LG stat.ML

    Sequence to Logic with Copy and Cache

    Authors: Javid Dadashkarimi, Sekhar Tatikonda

    Abstract: Generating logical form equivalents of human language is a fresh way to employ neural architectures where long short-term memory effectively captures dependencies in both encoder and decoder units. The logical form of the sequence usually preserves information from the natural language side in the form of similar tokens, and recently a copying mechanism has been proposed which increases the prob…

    Submitted 19 July, 2018; originally announced July 2018.

  11. arXiv:1711.09853  [pdf, ps, other]

    math.OC cs.IT

    The Time-Invariant Multidimensional Gaussian Sequential Rate-Distortion Problem Revisited

    Authors: Photios A. Stavrou, Takashi Tanaka, Sekhar Tatikonda

    Abstract: We revisit the sequential rate-distortion (SRD) trade-off problem for vector-valued Gauss-Markov sources with mean-squared error distortion constraints. We show via a counterexample that the dynamic reverse water-filling algorithm suggested by [1, eq. (15)] is not applicable to this problem, and consequently the closed form expression of the asymptotic SRD function derived in [1, eq. (17)] is not…

    Submitted 27 November, 2017; originally announced November 2017.

    Comments: 7 pages, 2 figures

    MSC Class: 90C22; 94A15

  12. arXiv:1611.07138  [pdf, other]

    math.OC cs.DS stat.ML

    A New Approach to Laplacian Solvers and Flow Problems

    Authors: Patrick Rebeschini, Sekhar Tatikonda

    Abstract: This paper investigates the behavior of the Min-Sum message passing scheme to solve systems of linear equations in the Laplacian matrices of graphs and to compute electric flows. Voltage and flow problems involve the minimization of quadratic functions and are fundamental primitives that arise in several domains. Algorithms that have been proposed are typically centralized and involve multiple gra…

    Submitted 7 March, 2019; v1 submitted 21 November, 2016; originally announced November 2016.

  13. The Rate-Distortion Function and Excess-Distortion Exponent of Sparse Regression Codes with Optimal Encoding

    Authors: Ramji Venkataramanan, Sekhar Tatikonda

    Abstract: This paper studies the performance of sparse regression codes for lossy compression with the squared-error distortion criterion. In a sparse regression code, codewords are linear combinations of subsets of columns of a design matrix. It is shown that with minimum-distance encoding, sparse regression codes achieve the Shannon rate-distortion function for i.i.d. Gaussian sources $R^*(D)$ as well as…

    Submitted 19 June, 2017; v1 submitted 21 January, 2014; originally announced January 2014.

    Comments: 16 pages. IEEE Transactions on Information Theory

    Journal ref: IEEE Transactions on Information Theory, Vol. 63, no. 8, pp. 5228-5243 (August 2017)

  14. arXiv:1301.0605  [pdf]

    cs.AI

    Loopy Belief Propogation and Gibbs Measures

    Authors: Sekhar Tatikonda, Michael I. Jordan

    Abstract: We address the question of convergence in the loopy belief propagation (LBP) algorithm. Specifically, we relate convergence of LBP to the existence of a weak limit for a sequence of Gibbs measures defined on the LBP's associated computation tree. Using tools from the theory of Gibbs measures we develop easily testable sufficient conditions for convergence. The failure of convergence o…

    Submitted 12 December, 2012; originally announced January 2013.

    Comments: Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)

    Report number: UAI-P-2002-PG-493-500

  15. arXiv:1212.2125  [pdf, other]

    cs.IT

    Sparse Regression Codes for Multi-terminal Source and Channel Coding

    Authors: Ramji Venkataramanan, Sekhar Tatikonda

    Abstract: We study a new class of codes for Gaussian multi-terminal source and channel coding. These codes are designed using the statistical framework of high-dimensional linear regression and are called Sparse Superposition or Sparse Regression codes. Codewords are linear combinations of subsets of columns of a design matrix. These codes were recently introduced by Barron and Joseph and shown to achieve t…

    Submitted 10 December, 2012; originally announced December 2012.

    Comments: 9 pages, appeared in the Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing - 2012

  16. Lossy Compression via Sparse Linear Regression: Computationally Efficient Encoding and Decoding

    Authors: Ramji Venkataramanan, Tuhin Sarkar, Sekhar Tatikonda

    Abstract: We propose computationally efficient encoders and decoders for lossy compression using a Sparse Regression Code. The codebook is defined by a design matrix and codewords are structured linear combinations of columns of this matrix. The proposed encoding algorithm sequentially chooses columns of the design matrix to successively approximate the source sequence. It is shown to achieve the optimal di…

    Submitted 28 March, 2014; v1 submitted 7 December, 2012; originally announced December 2012.

    Comments: 14 pages, to appear in IEEE Transactions on Information Theory

    Journal ref: IEEE Transactions on Information Theory, vol. 60, no. 6, pp. 3265-3278, June 2014

  17. arXiv:1212.0171  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    Message-Passing Algorithms for Quadratic Minimization

    Authors: Nicholas Ruozzi, Sekhar Tatikonda

    Abstract: Gaussian belief propagation (GaBP) is an iterative algorithm for computing the mean of a multivariate Gaussian distribution, or equivalently, the minimum of a multivariate positive definite quadratic function. Sufficient conditions, such as walk-summability, that guarantee the convergence and correctness of GaBP are known, but GaBP may fail to converge to the correct solution given an arbitrary po…

    Submitted 1 December, 2012; originally announced December 2012.

    Journal ref: Journal of Machine Learning Research, 14(Aug):2287-2314, 2013

  18. arXiv:1211.4521  [pdf, ps, other]

    cs.DB cs.DS cs.IR

    Hash in a Flash: Hash Tables for Solid State Devices

    Authors: Tyler Clemons, S. M. Faisal, Shirish Tatikonda, Charu Aggarawl, Srinivasan Parthasarathy

    Abstract: In recent years, information retrieval algorithms have taken center stage for extracting important data in ever larger datasets. Advances in hardware technology have led to the increasingly widespread use of flash storage devices. Such devices have clear benefits over traditional hard drives in terms of latency of access, bandwidth and random access capabilities particularly when reading data. T…

    Submitted 19 November, 2012; originally announced November 2012.

    Comments: 16 pages 10 figures

    ACM Class: H.2.7; H.2.8; H.3.1; E.2

  19. Rewritable storage channels with hidden state

    Authors: Ramji Venkataramanan, Sekhar Tatikonda, Luis Lastras, Michele Franceschini

    Abstract: Many storage channels admit reading and rewriting of the content at a given cost. We consider rewritable channels with a hidden state which models the unknown characteristics of the memory cell. In addition to mitigating the effect of the write noise, rewrites can help the write controller obtain a better estimate of the hidden state. The paper has two contributions. The first is a lower bound on…

    Submitted 3 June, 2013; v1 submitted 12 June, 2012; originally announced June 2012.

    Comments: 10 pages. Part of the paper appeared in the proceedings of the 2012 IEEE International Symposium on Information Theory

    Journal ref: IEEE Journal on Selected Areas in Communications, vol. 32, no. 5, pp. 815-824, May 2014

  20. Lossy Compression via Sparse Linear Regression: Performance under Minimum-distance Encoding

    Authors: Ramji Venkataramanan, Antony Joseph, Sekhar Tatikonda

    Abstract: We study a new class of codes for lossy compression with the squared-error distortion criterion, designed using the statistical framework of high-dimensional linear regression. Codewords are linear combinations of subsets of columns of a design matrix. Called a Sparse Superposition or Sparse Regression codebook, this structure is motivated by an analogous construction proposed recently by Barron a…

    Submitted 18 December, 2015; v1 submitted 3 February, 2012; originally announced February 2012.

    Comments: This version corrects a typo in the statement of Theorem 2 of the published paper

    Journal ref: IEEE Transactions on Information Theory, vol. 60, no. 6, pp. 3254-3264, June 2014

  21. arXiv:1107.3818  [pdf, other]

    cs.DM math.ST

    Conditioned Poisson distributions and the concentration of chromatic numbers

    Authors: John Hartigan, David Pollard, Sekhar Tatikonda

    Abstract: The paper provides a simpler method for proving a delicate inequality that was used by Achlioptas and Naor to establish asymptotic concentration for chromatic numbers of Erdos-Renyi random graphs. The simplifications come from two new ideas. The first involves a sharpened form of a piece of statistical folklore regarding goodness-of-fit tests for two-way tables of Poisson counts under linear condi…

    Submitted 19 July, 2011; originally announced July 2011.

    Comments: Unpublished paper from June 2008

  22. Achievable Rates for Channels with Deletions and Insertions

    Authors: Ramji Venkataramanan, Sekhar Tatikonda, Kannan Ramchandran

    Abstract: This paper considers a binary channel with deletions and insertions, where each input bit is transformed in one of the following ways: it is deleted with probability d, or an extra bit is added after it with probability i, or it is transmitted unmodified with probability 1-d-i. A computable lower bound on the capacity of this channel is derived. The transformation of the input sequence by the chan…

    Submitted 19 July, 2013; v1 submitted 24 February, 2011; originally announced February 2011.

    Comments: To appear in IEEE Transactions on Information Theory. For the deletion channel, the new capacity lower bound improves on the previous best bound for deletion probabilities up to 0.3

    Journal ref: IEEE Transactions on Information Theory, vol. 59, no. 11, pp. 6990-7013, Nov. 2013

  23. Message-Passing Algorithms: Reparameterizations and Splittings

    Authors: Nicholas Ruozzi, Sekhar Tatikonda

    Abstract: The max-product algorithm, a local message-passing scheme that attempts to compute the most probable assignment (MAP) of a given probability distribution, has been successfully employed as a method of approximate inference for applications arising in coding theory, computer vision, and machine learning. However, the max-product algorithm is not guaranteed to converge to the MAP assignment, and if…

    Submitted 1 December, 2012; v1 submitted 17 February, 2010; originally announced February 2010.

    Comments: A complete rework and expansion of the previous versions

    Journal ref: IEEE Transactions on Information Theory, vol. 59, no. 9, pp. 5860-5881, Sept. 2013

  24. arXiv:0911.2023  [pdf, other]

    cs.IT

    Opportunistic capacity and error exponent regions for compound channel with feedback

    Authors: Aditya Mahajan, Sekhar Tatikonda

    Abstract: Variable length communication over a compound channel with feedback is considered. Traditionally, capacity of a compound channel without feedback is defined as the maximum rate that is determined before the start of communication such that communication is reliable. This traditional definition is pessimistic. In the presence of feedback, an opportunistic definition is given. Capacity is defined as…

    Submitted 29 June, 2011; v1 submitted 10 November, 2009; originally announced November 2009.

  25. Network Tomography Based on Additive Metrics

    Authors: Jian Ni, Sekhar Tatikonda

    Abstract: Inference of the network structure (e.g., routing topology) and dynamics (e.g., link performance) is an essential component in many network design and management tasks. In this paper we propose a new, general framework for analyzing and designing routing topology and link performance inference algorithms using ideas and tools from phylogenetic inference in evolutionary biology. The framework is…

    Submitted 31 August, 2008; originally announced September 2008.

    Comments: 35 pages

    Journal ref: IEEE Transactions on Information Theory, 57(12), December 2011

  26. Capacity-achieving Feedback Scheme for Gaussian Finite-State Markov Channels with Channel State Information

    Authors: Jialing Liu, Nicola Elia, Sekhar Tatikonda

    Abstract: In this paper, we propose capacity-achieving communication schemes for Gaussian finite-state Markov channels (FSMCs) subject to an average channel input power constraint, under the assumption that the transmitters can have access to delayed noiseless output feedback as well as instantaneous or delayed channel state information (CSI). We show that the proposed schemes reveal connections between fe…

    Submitted 7 October, 2010; v1 submitted 14 August, 2008; originally announced August 2008.

    Comments: Submitted to the IEEE Transactions on Information Theory. 31 pages

  27. arXiv:0707.2014  [pdf, ps, other]

    cs.IT

    On the error exponent of variable-length block-coding schemes over finite-state Markov channels with feedback

    Authors: Giacomo Como, Serdar Yuksel, Sekhar Tatikonda

    Abstract: The error exponent of Markov channels with feedback is studied in the variable-length block-coding setting. Burnashev's classic result is extended and a single letter characterization for the reliability function of finite-state Markov channels is presented, under the assumption that the channel state is causally observed both at the transmitter and at the receiver side. Tools from stochastic co…

    Submitted 13 July, 2007; originally announced July 2007.

  28. arXiv:cs/0701099  [pdf, ps, other]

    cs.IT

    On the Feedback Capacity of Power Constrained Gaussian Noise Channels with Memory

    Authors: Shaohua Yang, Aleksandar Kavcic, Sekhar Tatikonda

    Abstract: For a stationary additive Gaussian-noise channel with a rational noise power spectrum of a finite-order $L$, we derive two new results for the feedback capacity under an average channel input power constraint. First, we show that a very simple feedback-dependent Gauss-Markov source achieves the feedback capacity, and that Kalman-Bucy filtering is optimal for processing the feedback. Based on the…

    Submitted 16 January, 2007; originally announced January 2007.

    Comments: Transaction on Information Theory, accepted version, first version submitted on Oct 22, 2003

  29. arXiv:cs/0609139  [pdf, ps, other]

    cs.IT

    The Capacity of Channels with Feedback

    Authors: Sekhar Tatikonda, Sanjoy Mitter

    Abstract: We introduce a general framework for treating channels with memory and feedback. First, we generalize Massey's concept of directed information and use it to characterize the feedback capacity of general channels. Second, we present coding results for Markov channels. This requires determining appropriate sufficient statistics at the encoder and decoder. Third, a dynamic programming framework for…

    Submitted 25 September, 2006; originally announced September 2006.