Showing 1–50 of 105 results for author: Moitra, A

Searching in archive cs.
  1. arXiv:2406.11686  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation

    Authors: Noah Golowich, Ankur Moitra

    Abstract: In this paper, we study the offline RL problem with linear function approximation. Our main structural assumption is that the MDP has low inherent Bellman error, which stipulates that linear value functions have linear Bellman backups with respect to the greedy policy. This assumption is natural in that it is essentially the minimal assumption required for value iteration to succeed. We give a com… ▽ More

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: RLC 2024

  2. arXiv:2406.11640  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions

    Authors: Noah Golowich, Ankur Moitra

    Abstract: One of the most natural approaches to reinforcement learning (RL) with function approximation is value iteration, which inductively generates approximations to the optimal value function by solving a sequence of regression problems. To ensure the success of value iteration, it is typically assumed that Bellman completeness holds, which ensures that these regression problems are well-specified. We… ▽ More

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: COLT 2024

  3. arXiv:2406.02633  [pdf, ps, other]

    cs.CR cs.AI cs.LG

    Edit Distance Robust Watermarks for Language Models

    Authors: Noah Golowich, Ankur Moitra

    Abstract: Motivated by the problem of detecting AI-generated text, we consider the problem of watermarking the output of language models with provable guarantees. We aim for watermarks which satisfy: (a) undetectability, a cryptographic notion introduced by Christ, Gunn & Zamir (2024) which stipulates that it is computationally hard to distinguish watermarked language model outputs from the model's actual o… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2405.00082  [pdf, other]

    quant-ph cs.DS cs.LG

    Structure learning of Hamiltonians from real-time evolution

    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Ewin Tang

    Abstract: We initiate the study of Hamiltonian structure learning from real-time evolution: given the ability to apply $e^{-\mathrm{i} Ht}$ for an unknown local Hamiltonian $H = \sum_{a = 1}^m λ_a E_a$ on $n$ qubits, the goal is to recover $H$. This problem is already well-studied under the assumption that the interaction terms, $E_a$, are given, and only the interaction strengths, $λ_a$, are unknown. But i… ▽ More

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 50 pages

  5. PIVOT- Input-aware Path Selection for Energy-efficient ViT Inference

    Authors: Abhishek Moitra, Abhiroop Bhattacharjee, Priyadarshini Panda

    Abstract: The attention module in vision transformers (ViTs) performs intricate spatial correlations, contributing significantly to accuracy and delay. It is thereby important to modulate the number of attentions according to the input feature complexity for optimal delay-accuracy tradeoffs. To this end, we propose PIVOT - a co-optimization framework which selectively performs attention skipping based on the… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted to 61st ACM/IEEE Design Automation Conference (DAC '24), June 23--27, 2024, San Francisco, CA, USA (6 Pages)

  6. arXiv:2404.11325  [pdf, ps, other]

    cs.CR cs.DS

    On Learning Parities with Dependent Noise

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: In this expository note we show that the learning parities with noise (LPN) assumption is robust to weak dependencies in the noise distribution of small batches of samples. This provides a partial converse to the linearization technique of [AG11]. The material in this note is drawn from a recent work by the authors [GMR24], where the robustness guarantee was a key component in a cryptographic sepa… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: This note draws heavily from arXiv:2404.03774

  7. arXiv:2404.03774  [pdf, other]

    cs.LG cs.CC cs.CR cs.DS

    Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: Supervised learning is often computationally easy in practice. But to what extent does this mean that other modes of learning, such as reinforcement learning (RL), ought to be computationally easy by extension? In this work we show the first cryptographic separation between RL and supervised learning, by exhibiting a class of block MDPs and associated decoding functions where reward-free explorati… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 112 pages, 3 figures

  8. arXiv:2403.16850  [pdf, other]

    quant-ph cs.DS math-ph

    High-Temperature Gibbs States are Unentangled and Efficiently Preparable

    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Ewin Tang

    Abstract: We show that thermal states of local Hamiltonians are separable above a constant temperature. Specifically, for a local Hamiltonian $H$ on a graph with degree $\mathfrak{d}$, its Gibbs state at inverse temperature $β$, denoted by $ρ=e^{-βH}/ \textrm{tr}(e^{-βH})$, is a classical distribution over product states for all $β< 1/(c\mathfrak{d})$, where $c$ is a constant. This sudden death of thermal e… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  9. arXiv:2402.02586  [pdf, other]

    cs.LG cs.ET

    ClipFormer: Key-Value Clipping of Transformers on Memristive Crossbars for Write Noise Mitigation

    Authors: Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

    Abstract: Transformers have revolutionized various real-world applications from natural language processing to computer vision. However, traditional von-Neumann computing paradigm faces memory and bandwidth limitations in accelerating transformers owing to their massive model sizes. To this end, In-memory Computing (IMC) crossbars based on Non-volatile Memories (NVMs), due to their ability to perform highly… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 9 pages, 10 figures, 3 tables, 1 appendix

  10. arXiv:2401.08001  [pdf, other]

    cs.NE

    TT-SNN: Tensor Train Decomposition for Efficient Spiking Neural Network Training

    Authors: Donghyun Lee, Ruokai Yin, Youngeun Kim, Abhishek Moitra, Yuhang Li, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have gained significant attention as a potentially energy-efficient alternative for standard neural networks with their sparse binary activation. However, SNNs suffer from memory and computation overhead due to spatio-temporal dynamics and multiple backpropagation computations across timesteps during training. To address this issue, we introduce Tensor Train Decompos… ▽ More

    Submitted 15 January, 2024; originally announced January 2024.

  11. arXiv:2312.03559  [pdf, other]

    cs.AR

    MCAIMem: a Mixed SRAM and eDRAM Cell for Area and Energy-efficient on-chip AI Memory

    Authors: Duy-Thanh Nguyen, Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

    Abstract: AI chips commonly employ SRAM memory as buffers for their reliability and speed, which contribute to high performance. However, SRAM is expensive and demands significant area and energy consumption. Previous studies have explored replacing SRAM with emerging technologies like non-volatile memory, which offers fast-read memory access and a small cell area. Despite these advantages, non-volatile mem… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

  12. arXiv:2310.06845  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    RobustEdge: Low Power Adversarial Detection for Cloud-Edge Systems

    Authors: Abhishek Moitra, Abhiroop Bhattacharjee, Youngeun Kim, Priyadarshini Panda

    Abstract: In practical cloud-edge scenarios, where a resource constrained edge performs data acquisition and a cloud system (having sufficient resources) performs inference tasks with a deep neural network (DNN), adversarial robustness is critical for reliability and ubiquitous deployment. Adversarial detection is a prime adversarial defence technique used in prior literature. However, in prior detection wo… ▽ More

    Submitted 5 September, 2023; originally announced October 2023.

    Comments: IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI)

  13. arXiv:2310.02243  [pdf, other]

    quant-ph cs.DS cs.LG

    Learning quantum Hamiltonians at any temperature in polynomial time

    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Ewin Tang

    Abstract: We study the problem of learning a local quantum Hamiltonian $H$ given copies of its Gibbs state $ρ= e^{-βH}/\textrm{tr}(e^{-βH})$ at a known inverse temperature $β>0$. Anshu, Arunachalam, Kuwahara, and Soleimanifar (arXiv:2004.07266) gave an algorithm to learn a Hamiltonian on $n$ qubits to precision $ε$ with only polynomially many copies of the Gibbs state, but which takes exponential time. Obta… ▽ More

    Submitted 3 October, 2023; originally announced October 2023.

  14. arXiv:2309.09457  [pdf, ps, other]

    cs.LG cs.AI cs.DS math.OC stat.ML

    Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $φ(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation. But where do these features come from? In the absence of expert domain knowledge, a tempting strategy is to use the ``kitchen s… ▽ More

    Submitted 18 September, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

  15. arXiv:2309.03388  [pdf, other]

    cs.NE

    Are SNNs Truly Energy-efficient? $-$ A Hardware Perspective

    Authors: Abhiroop Bhattacharjee, Ruokai Yin, Abhishek Moitra, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have gained attention for their energy-efficient machine learning capabilities, utilizing bio-inspired activation functions and sparse binary spike-data representations. While recent SNN algorithmic advances achieve high accuracy on large-scale computer vision tasks, their energy-efficiency claims rely on certain impractical estimation metrics. This work studies two… ▽ More

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: 5 pages

  16. HyDe: A Hybrid PCM/FeFET/SRAM Device-search for Optimizing Area and Energy-efficiencies in Analog IMC Platforms

    Authors: Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

    Abstract: Today, there are a plethora of In-Memory Computing (IMC) devices- SRAMs, PCMs & FeFETs, that emulate convolutions on crossbar-arrays with high throughput. Each IMC device offers its own pros & cons during inference of Deep Neural Networks (DNNs) on crossbars in terms of area overhead, programming energy and non-idealities. A design-space exploration is, therefore, imperative to derive a hybrid-dev… ▽ More

    Submitted 24 October, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted to IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS)

    Journal ref: IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), 2023

  17. arXiv:2307.06538  [pdf, ps, other]

    cs.LG cs.DS math.OC stat.ML

    Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems

    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Morris Yau

    Abstract: Recently Chen and Poor initiated the study of learning mixtures of linear dynamical systems. While linear dynamical systems already have wide-ranging applications in modeling time-series data, using mixture models can lead to a better fit or even a richer understanding of underlying subpopulations represented in the data. In this work we give a new approach to learning mixtures of linear dynamical… ▽ More

    Submitted 23 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: ICML 2023

  18. arXiv:2306.01993  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Provable benefits of score matching

    Authors: Chirag Pabbaraju, Dhruv Rohatgi, Anish Sevekari, Holden Lee, Ankur Moitra, Andrej Risteski

    Abstract: Score matching is an alternative to maximum likelihood (ML) for estimating a probability distribution parametrized up to a constant of proportionality. By fitting the ''score'' of the distribution, it sidesteps the need to compute this constant of proportionality (which is often intractable). While score matching and variants thereof are popular in practice, precise theoretical understanding of th… ▽ More

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 25 Pages

  19. Examining the Role and Limits of Batchnorm Optimization to Mitigate Diverse Hardware-noise in In-memory Computing

    Authors: Abhiroop Bhattacharjee, Abhishek Moitra, Youngeun Kim, Yeshwanth Venkatesha, Priyadarshini Panda

    Abstract: In-Memory Computing (IMC) platforms such as analog crossbars are gaining focus as they facilitate the acceleration of low-precision Deep Neural Networks (DNNs) with high area- & compute-efficiencies. However, the intrinsic non-idealities in crossbars, which are often non-deterministic and non-linear, degrade the performance of the deployed DNNs. In addition to quantization errors, most frequently… ▽ More

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted in Great Lakes Symposium on VLSI 2023 (GLSVLSI 2023) conference

    Journal ref: Great Lakes Symposium on VLSI 2023 (GLSVLSI 2023) conference

  20. arXiv:2305.18360  [pdf, other]

    cs.NE

    Sharing Leaky-Integrate-and-Fire Neurons for Memory-Efficient Spiking Neural Networks

    Authors: Youngeun Kim, Yuhang Li, Abhishek Moitra, Ruokai Yin, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have gained increasing attention as energy-efficient neural networks owing to their binary and asynchronous computation. However, their non-linear activation, that is Leaky-Integrate-and-Fire (LIF) neuron, requires additional memory to store a membrane voltage to capture the temporal dynamics of spikes. Although the required memory cost for LIF neurons significantly… ▽ More

    Submitted 26 May, 2023; originally announced May 2023.

  21. arXiv:2305.17346  [pdf, other]

    cs.NE cs.LG

    Input-Aware Dynamic Timestep Spiking Neural Networks for Efficient In-Memory Computing

    Authors: Yuhang Li, Abhishek Moitra, Tamar Geller, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have recently attracted widespread research interest as an efficient alternative to traditional Artificial Neural Networks (ANNs) because of their capability to process sparse and binary spike information and avoid expensive multiplication operations. Although the efficiency of SNNs can be realized on the In-Memory Computing (IMC) architecture, we show that the energ… ▽ More

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Published at Design & Automation Conferences (DAC) 2023

  22. arXiv:2305.17223  [pdf, other]

    cs.CV cs.AI

    Do We Really Need a Large Number of Visual Prompts?

    Authors: Youngeun Kim, Yuhang Li, Abhishek Moitra, Ruokai Yin, Priyadarshini Panda

    Abstract: Due to increasing interest in adapting models on resource-constrained edges, parameter-efficient transfer learning has been widely explored. Among various methods, Visual Prompt Tuning (VPT), prepending learnable prompts to input space, shows competitive fine-tuning performance compared to training of full network parameters. However, VPT increases the number of input tokens, resulting in addition… ▽ More

    Submitted 12 May, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

  23. arXiv:2305.09850  [pdf, other]

    cs.NE

    MINT: Multiplier-less INTeger Quantization for Energy Efficient Spiking Neural Networks

    Authors: Ruokai Yin, Yuhang Li, Abhishek Moitra, Priyadarshini Panda

    Abstract: We propose Multiplier-less INTeger (MINT) quantization, a uniform quantization scheme that efficiently compresses weights and membrane potentials in spiking neural networks (SNNs). Unlike previous SNN quantization methods, MINT quantizes memory-intensive membrane potentials to an extremely low precision (2-bit), significantly reducing the memory footprint. MINT also shares the quantization scaling… ▽ More

    Submitted 7 November, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: 6 pages. Accepted to 29th Asia and South Pacific Design Automation Conference (ASP-DAC 2024), nominated for best paper award

  24. arXiv:2304.01954  [pdf, ps, other]

    cs.DS cs.DM math.CO math.PR

    Strong spatial mixing for colorings on trees and its algorithmic applications

    Authors: Zongchen Chen, Kuikui Liu, Nitya Mani, Ankur Moitra

    Abstract: Strong spatial mixing (SSM) is an important quantitative notion of correlation decay for Gibbs distributions arising in statistical physics, probability theory, and theoretical computer science. A longstanding conjecture is that the uniform distribution on proper $q$-colorings on a $Δ$-regular tree exhibits SSM whenever $q \ge Δ+1$. Moreover, it is widely believed that as long as SSM holds on boun… ▽ More

    Submitted 13 February, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: 54 pages, 3 page appendix

  25. arXiv:2303.17646  [pdf, other]

    cs.CV

    XPert: Peripheral Circuit & Neural Architecture Co-search for Area and Energy-efficient Xbar-based Computing

    Authors: Abhishek Moitra, Abhiroop Bhattacharjee, Youngeun Kim, Priyadarshini Panda

    Abstract: The hardware-efficiency and accuracy of Deep Neural Networks (DNNs) implemented on In-memory Computing (IMC) architectures primarily depend on the DNN architecture and the peripheral circuit parameters. It is therefore essential to holistically co-search the network and peripheral parameters to achieve optimal performance. To this end, we propose XPert, which co-searches network architecture in ta… ▽ More

    Submitted 21 November, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted to Design and Automation Conference (DAC)

    Journal ref: 60th DAC, 2023

  26. arXiv:2302.07769  [pdf, other]

    cs.LG cs.ET

    XploreNAS: Explore Adversarially Robust & Hardware-efficient Neural Architectures for Non-ideal Xbars

    Authors: Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

    Abstract: Compute In-Memory platforms such as memristive crossbars are gaining focus as they facilitate acceleration of Deep Neural Networks (DNNs) with high area and compute-efficiencies. However, the intrinsic non-idealities associated with the analog nature of computing in crossbars limits the performance of the deployed DNNs. Furthermore, DNNs are shown to be vulnerable to adversarial attacks leading to… ▽ More

    Submitted 15 April, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: Accepted to ACM Transactions on Embedded Computing Systems in April 2023

    Journal ref: ACM Transactions on Embedded Computing Systems (2023)

  27. arXiv:2302.06746  [pdf, other]

    cs.NE

    Workload-Balanced Pruning for Sparse Spiking Neural Networks

    Authors: Ruokai Yin, Youngeun Kim, Yuhang Li, Abhishek Moitra, Nitin Satpute, Anna Hambitzer, Priyadarshini Panda

    Abstract: Pruning for Spiking Neural Networks (SNNs) has emerged as a fundamental methodology for deploying deep SNNs on resource-constrained edge devices. Though the existing pruning methods can provide extremely high weight sparsity for deep SNNs, the high weight sparsity brings a workload imbalance problem. Specifically, the workload imbalance happens when a different number of non-zero weights are assig… ▽ More

    Submitted 22 March, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: 11 pages. Accepted to IEEE Transactions on Emerging Topics in Computational Intelligence (2024)

  28. arXiv:2302.04712  [pdf, other]

    cs.LG cs.ET

    DeepCAM: A Fully CAM-based Inference Accelerator with Variable Hash Lengths for Energy-efficient Deep Neural Networks

    Authors: Duy-Thanh Nguyen, Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

    Abstract: With ever increasing depth and width in deep neural networks to achieve state-of-the-art performance, deep learning computation has significantly grown, and dot-products remain dominant in overall computation time. Most prior works are built on conventional dot-product where weighted input summation is used to represent the neuron operation. However, another implementation of dot-product based on… ▽ More

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted to Design, Automation and Test in Europe (DATE) Conference, 2023

    Journal ref: Design, Automation and Test in Europe (DATE) Conference, 2023

  29. arXiv:2301.09519  [pdf, ps, other]

    math.OC cs.DS cs.LG stat.ML

    A New Approach to Learning Linear Dynamical Systems

    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Morris Yau

    Abstract: Linear dynamical systems are the foundational statistical model upon which control theory is built. Both the celebrated Kalman filter and the linear quadratic regulator require knowledge of the system dynamics to provide analytic guarantees. Naturally, learning the dynamics of a linear dynamical system from linear measurements has been intensively studied since Rudolph Kalman's pioneering work in… ▽ More

    Submitted 23 January, 2023; originally announced January 2023.

  30. arXiv:2210.12899  [pdf, other]

    cs.NE cs.CV

    SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks

    Authors: Abhishek Moitra, Abhiroop Bhattacharjee, Runcong Kuang, Gokul Krishnan, Yu Cao, Priyadarshini Panda

    Abstract: SNNs are an active research domain towards energy efficient machine intelligence. Compared to conventional ANNs, SNNs use temporal spike data and bio-plausible neuronal activation functions such as Leaky-Integrate Fire/Integrate Fire (LIF/IF) for data processing. However, SNNs incur significant dot-product operations causing high memory and computation overhead in standard von-Neumann computing pl… ▽ More

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: 14 pages, 22 figures

  31. arXiv:2207.11903  [pdf, ps, other]

    cs.DS cs.LG cs.SI math.PR stat.ML

    Minimax Rates for Robust Community Detection

    Authors: Allen Liu, Ankur Moitra

    Abstract: In this work, we study the problem of community detection in the stochastic block model with adversarial node corruptions. Our main result is an efficient algorithm that can tolerate an $ε$-fraction of corruptions and achieves error $O(ε) + e^{-\frac{C}{2} (1 \pm o(1))}$ where $C = (\sqrt{a} - \sqrt{b})^2$ is the signal-to-noise ratio and $a/n$ and $b/n$ are the inter-community and intra-community… ▽ More

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: To appear in FOCS 2022

  32. arXiv:2207.02841  [pdf, other]

    cs.DS

    From algorithms to connectivity and back: finding a giant component in random k-SAT

    Authors: Zongchen Chen, Nitya Mani, Ankur Moitra

    Abstract: We take an algorithmic approach to studying the solution space geometry of relatively sparse random and bounded degree $k$-CNFs for large $k$. In the course of doing so, we establish that with high probability, a random $k$-CNF $Φ$ with $n$ variables and clause density $α= m/n \lesssim 2^{k/6}$ has a giant component of solutions that are connected in a graph where solutions are adjacent if they ha… ▽ More

    Submitted 15 July, 2022; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: 41 pages, 1 figure

    MSC Class: 68W20; 68W25; 68W40; 68Q87 ACM Class: G.3; F.2.2

  33. arXiv:2206.14754  [pdf, other]

    cs.LG

    Distilling Model Failures as Directions in Latent Space

    Authors: Saachi Jain, Hannah Lawrence, Ankur Moitra, Aleksander Madry

    Abstract: Existing methods for isolating hard subpopulations and spurious correlations in datasets often require human intervention. This can make these methods labor-intensive and dataset-specific. To address these shortcomings, we present a scalable method for automatically distilling a model's failure modes. Specifically, we harness linear classifiers to identify consistent error patterns, and, in turn,… ▽ More

    Submitted 2 December, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

  34. Examining the Robustness of Spiking Neural Networks on Non-ideal Memristive Crossbars

    Authors: Abhiroop Bhattacharjee, Youngeun Kim, Abhishek Moitra, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have recently emerged as the low-power alternative to Artificial Neural Networks (ANNs) owing to their asynchronous, sparse, and binary information processing. To improve the energy-efficiency and throughput, SNNs can be implemented on memristive crossbars where Multiply-and-Accumulate (MAC) operations are realized in the analog domain using emerging Non-Volatile-Mem… ▽ More

    Submitted 20 June, 2022; originally announced June 2022.

    Comments: Accepted in ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2022

  35. arXiv:2206.03446  [pdf, ps, other]

    cs.LG cs.AI cs.DS math.OC stat.ML

    Learning in Observable POMDPs, without Computationally Intractable Oracles

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g. deterministic transitions) or assume access to an oracle for solving a hard optimistic planni… ▽ More

    Submitted 7 June, 2022; originally announced June 2022.

  36. arXiv:2205.14284  [pdf, other]

    stat.ML cs.DS cs.LG econ.EM

    Provably Auditing Ordinary Least Squares in Low Dimensions

    Authors: Ankur Moitra, Dhruv Rohatgi

    Abstract: Measuring the stability of conclusions derived from Ordinary Least Squares linear regression is critically important, but most metrics either only measure local stability (i.e. against infinitesimal changes in the data), or are only interpretable under statistical assumptions. Recent work proposes a simple, global, finite-sample stability metric: the minimum number of samples that need to be remov… ▽ More

    Submitted 5 June, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 32 pages, 4 figures. Added acknowledgments/funding

  37. arXiv:2204.05422  [pdf, other]

    cs.NE cs.AR

    SATA: Sparsity-Aware Training Accelerator for Spiking Neural Networks

    Authors: Ruokai Yin, Abhishek Moitra, Abhiroop Bhattacharjee, Youngeun Kim, Priyadarshini Panda

    Abstract: Spiking Neural Networks (SNNs) have gained huge attention as a potential energy-efficient alternative to conventional Artificial Neural Networks (ANNs) due to their inherent high-sparsity activation. Recently, SNNs with backpropagation through time (BPTT) have achieved a higher accuracy result on image recognition tasks than other SNN training algorithms. Despite the success from the algorithm per… ▽ More

    Submitted 19 December, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: 13 pages. Accepted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2022)

  38. MIME: Adapting a Single Neural Network for Multi-task Inference with Memory-efficient Dynamic Pruning

    Authors: Abhiroop Bhattacharjee, Yeshwanth Venkatesha, Abhishek Moitra, Priyadarshini Panda

    Abstract: Recent years have seen a paradigm shift towards multi-task learning. This calls for memory and energy-efficient solutions for inference in a multi-task scenario. We propose an algorithm-hardware co-design approach called MIME. MIME reuses the weight parameters of a trained parent task and learns task-specific threshold parameters for inference on multiple child tasks. We find that MIME results in… ▽ More

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted in Design Automation Conference (DAC), 2022

    Journal ref: 59th Design Automation Conference (DAC), 2022

  39. arXiv:2202.04271  [pdf, other]

    cs.CV

    Adversarial Detection without Model Information

    Authors: Abhishek Moitra, Youngeun Kim, Priyadarshini Panda

    Abstract: Prior state-of-the-art adversarial detection works are classifier model dependent, i.e., they require classifier model outputs and parameters for training the detector or during adversarial detection. This makes their detection approach classifier model specific. Furthermore, classifier model outputs and parameters might not always be accessible. To this end, we propose a classifier model independ… ▽ More

    Submitted 5 April, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: This paper has 14 pages of content and 2 pages of references

  40. arXiv:2202.03133  [pdf, other]

    cs.NE cs.CV

    Rate Coding or Direct Coding: Which One is Better for Accurate, Robust, and Energy-efficient Spiking Neural Networks?

    Authors: Youngeun Kim, Hyoungseob Park, Abhishek Moitra, Abhiroop Bhattacharjee, Yeshwanth Venkatesha, Priyadarshini Panda

    Abstract: Recent Spiking Neural Networks (SNNs) works focus on an image classification task, therefore various coding techniques have been proposed to convert an image into temporal binary spikes. Among them, rate coding and direct coding are regarded as prospective candidates for building a practical SNN system as they show state-of-the-art performance on large-scale datasets. Despite their usage, there is… ▽ More

    Submitted 12 April, 2022; v1 submitted 31 January, 2022; originally announced February 2022.

    Comments: Accepted to ICASSP2022

  41. arXiv:2201.04735  [pdf, ps, other]

    cs.LG cs.DS math.OC stat.ML

    Planning in Observable POMDPs in Quasipolynomial Time

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: Partially Observable Markov Decision Processes (POMDPs) are a natural and general model in reinforcement learning that take into account the agent's uncertainty about its current state. In the literature on POMDPs, it is customary to assume access to a planning oracle that computes an optimal policy when the parameters are known, even though the problem is known to be computationally hard. Almost… ▽ More

    Submitted 23 March, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

    Comments: 52 pages

  42. arXiv:2112.06380  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Robust Voting Rules from Algorithmic Robust Statistics

    Authors: Allen Liu, Ankur Moitra

    Abstract: Maximum likelihood estimation furnishes powerful insights into voting theory, and the design of voting rules. However the MLE can usually be badly corrupted by a single outlying sample. This means that a single voter or a group of colluding voters can vote strategically and drastically affect the outcome. Motivated by recent progress in algorithmic robust statistics, we revisit the fundamental pro…

    Submitted 16 July, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

  43. arXiv:2111.06395  [pdf, ps, other]

    stat.ML cs.DS cs.LG eess.SY

    Kalman Filtering with Adversarial Corruptions

    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

    Abstract: Here we revisit the classic problem of linear quadratic estimation, i.e. estimating the trajectory of a linear dynamical system from noisy measurements. The celebrated Kalman filter gives an optimal estimator when the measurement noise is Gaussian, but is widely known to break down when one deviates from this assumption, e.g. when the noise is heavy-tailed. Many ad hoc heuristics have been employe…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 57 pages, comments welcome

  44. arXiv:2110.13052  [pdf, ps, other]

    cs.LG cs.AI cs.DS math.OC stat.ML

    Can Q-Learning be Improved with Advice?

    Authors: Noah Golowich, Ankur Moitra

    Abstract: Despite rapid progress in theoretical reinforcement learning (RL) over the last few years, most of the known guarantees are worst-case in nature, failing to take advantage of structure that may be known a priori about a given RL problem at hand. In this paper we address the question of whether worst-case lower bounds for regret in online learning of Markov decision processes (MDPs) can be circumve…

    Submitted 25 October, 2021; originally announced October 2021.

  45. DetectX -- Adversarial Input Detection using Current Signatures in Memristive XBar Arrays

    Authors: Abhishek Moitra, Priyadarshini Panda

    Abstract: Adversarial input detection has emerged as a prominent technique to harden Deep Neural Networks (DNNs) against adversarial attacks. Most prior works use neural network-based detectors or complex statistical analysis for adversarial detection. These approaches are computationally intensive and vulnerable to adversarial attacks. To this end, we propose DetectX - a hardware friendly adversarial detect…

    Submitted 22 June, 2021; originally announced June 2021.

    Comments: 14 pages, 13 figures

    Journal ref: IEEE Transactions on Circuits and Systems I: Regular Papers, 2021

  46. arXiv:2106.08393  [pdf, ps, other]

    cs.LG cs.CR cs.DS

    Spoofing Generalization: When Can't You Trust Proprietary Models?

    Authors: Ankur Moitra, Elchanan Mossel, Colin Sandon

    Abstract: In this work, we study the computational complexity of determining whether a machine learning model that perfectly fits the training data will generalize to unseen data. In particular, we study the power of a malicious agent whose goal is to construct a model g that fits its training data and nothing else, but is indistinguishable from an accurate model f. We say that g strongly spoofs f if no po…

    Submitted 23 March, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

  47. arXiv:2106.02774  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Robust Model Selection and Nearly-Proper Learning for GMMs

    Authors: Jerry Li, Allen Liu, Ankur Moitra

    Abstract: In learning theory, a standard assumption is that the data is generated from a finite mixture model. But what happens when the number of components is not known in advance? The problem of estimating the number of components, also called model selection, is important in its own right but there are essentially no known efficient algorithms with provable guarantees let alone ones that can tolerate ad…

    Submitted 22 April, 2023; v1 submitted 4 June, 2021; originally announced June 2021.

  48. arXiv:2106.02680  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Algorithms from Invariants: Smoothed Analysis of Orbit Recovery over $SO(3)$

    Authors: Allen Liu, Ankur Moitra

    Abstract: In this work we study orbit recovery over $SO(3)$, where the goal is to recover a function on the sphere from noisy, randomly rotated copies of it. We assume that the function is a linear combination of low-degree spherical harmonics. This is a natural abstraction for the problem of recovering the three-dimensional structure of a molecule through cryo-electron tomography. For provably learning the…

    Submitted 1 May, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

  49. Efficiency-driven Hardware Optimization for Adversarially Robust Neural Networks

    Authors: Abhiroop Bhattacharjee, Abhishek Moitra, Priyadarshini Panda

    Abstract: With a growing need to enable intelligence in embedded devices in the Internet of Things (IoT) era, secure hardware implementation of Deep Neural Networks (DNNs) has become imperative. We will focus on how to address adversarial robustness for DNNs through efficiency-driven hardware optimizations. Since memory (specifically, dot-product operations) is a key energy-spending component for DNNs, hard…

    Submitted 9 May, 2021; originally announced May 2021.

    Comments: 6 pages, 8 figures, 3 tables; Accepted in DATE 2021 conference. arXiv admin note: text overlap with arXiv:2008.11298

    Journal ref: 2021 Design, Automation and Test in Europe (DATE) Conference

  50. arXiv:2104.09665  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Learning GMMs with Nearly Optimal Robustness Guarantees

    Authors: Allen Liu, Ankur Moitra

    Abstract: In this work we solve the problem of robustly learning a high-dimensional Gaussian mixture model with $k$ components from $ε$-corrupted samples up to accuracy $\widetilde{O}(ε)$ in total variation distance for any constant $k$ and with mild assumptions on the mixture. This robustness guarantee is optimal up to polylogarithmic factors. The main challenge is that most earlier works rely on learning…

    Submitted 14 November, 2021; v1 submitted 19 April, 2021; originally announced April 2021.