Showing 1–3 of 3 results for author: Iliev, N

  1. arXiv:2104.05217  [pdf, other]

    eess.SY cs.LG

    ENOS: Energy-Aware Network Operator Search for Hybrid Digital and Compute-in-Memory DNN Accelerators

    Authors: Shamma Nasrin, Ahish Shylendra, Yuti Kadakia, Nick Iliev, Wilfred Gomes, Theja Tulabandhula, Amit Ranjan Trivedi

    Abstract: This work proposes a novel Energy-Aware Network Operator Search (ENOS) approach to address the energy-accuracy trade-offs of a deep neural network (DNN) accelerator. In recent years, novel inference operators have been proposed to improve the computational efficiency of a DNN. Augmenting the operators, their corresponding novel computing modes have also been explored. However, simplification of DN…

    Submitted 12 April, 2021; originally announced April 2021.

  2. arXiv:2102.08247   

    cs.RO cs.AR eess.IV

    Probabilistic Localization of Insect-Scale Drones on Floating-Gate Inverter Arrays

    Authors: Priyesh Shukla, Ankith Muralidhar, Nick Iliev, Theja Tulabandhula, Sawyer B. Fuller, Amit Ranjan Trivedi

    Abstract: We propose a novel compute-in-memory (CIM)-based ultra-low-power framework for probabilistic localization of insect-scale drones. The conventional probabilistic localization approaches rely on the three-dimensional (3D) Gaussian Mixture Model (GMM)-based representation of a 3D map. A GMM model with hundreds of mixture functions is typically needed to adequately learn and represent the intricacies…

    Submitted 24 May, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Will submit the revised article.

    ACM Class: B.7; I.2.9

  3. arXiv:2011.12839  [pdf]

    cs.AR cs.NE eess.SP

    Low Latency CMOS Hardware Acceleration for Fully Connected Layers in Deep Neural Networks

    Authors: Nick Iliev, Amit Ranjan Trivedi

    Abstract: We present a novel low latency CMOS hardware accelerator for fully connected (FC) layers in deep neural networks (DNNs). The FC accelerator, FC-ACCL, is based on 128 8x8 or 16x16 processing elements (PEs) for matrix-vector multiplication, and 128 multiply-accumulate (MAC) units integrated with 128 High Bandwidth Memory (HBM) units for storing the pretrained weights. Micro-architectural details for…

    Submitted 25 November, 2020; originally announced November 2020.