Showing 1–11 of 11 results for author: Shen, J P

  1. arXiv:2405.11844

    cs.AR cs.ET

    NeRTCAM: CAM-Based CMOS Implementation of Reference Frames for Neuromorphic Processors

    Authors: Harideep Nair, William Leyman, Agastya Sampath, Quinn Jacobson, John Paul Shen

    Abstract: Neuromorphic architectures mimicking biological neural networks have been proposed as a much more efficient alternative to conventional von Neumann architectures for the exploding compute demands of AI workloads. Recent neuroscience theory on intelligence suggests that Cortical Columns (CCs) are the fundamental compute units in the neocortex and intelligence arises from CC's ability to store, pred…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted and Presented at Neuro-Inspired Computational Elements (NICE) Conference, La Jolla, CA. 2024

  2. arXiv:2404.15312

    eess.SP cs.CV

    Realtime Person Identification via Gait Analysis

    Authors: Shanmuga Venkatachalam, Harideep Nair, Prabhu Vellaisamy, Yongqi Zhou, Ziad Youssfi, John Paul Shen

    Abstract: Each person has a unique gait, i.e., walking style, that can be used as a biometric for personal identification. Recent works have demonstrated effective gait recognition using deep neural networks, however most of these works predominantly focus on classification accuracy rather than model efficiency. In order to perform gait recognition using wearable devices on the edge, it is imperative to dev…

    Submitted 2 April, 2024; originally announced April 2024.

  3. arXiv:2402.19376

    cs.AR

    OzMAC: An Energy-Efficient Sparsity-Exploiting Multiply-Accumulate-Unit Design for DL Inference

    Authors: Harideep Nair, Prabhu Vellaisamy, Tsung-Han Lin, Perry Wang, Shawn Blanton, John Paul Shen

    Abstract: General Matrix Multiply (GEMM) hardware, employing large arrays of multiply-accumulate (MAC) units, perform bulk of the computation in deep learning (DL). Recent trends have established 8-bit integer (INT8) as the most widely used precision for DL inference. This paper proposes a novel MAC design capable of dynamically exploiting bit sparsity (i.e., number of '0' bits within a binary value) in inp…

    Submitted 29 February, 2024; originally announced February 2024.

  4. arXiv:2205.14248

    cs.ET cs.AR cs.NE

    Towards a Design Framework for TNN-Based Neuromorphic Sensory Processing Units

    Authors: Prabhu Vellaisamy, John Paul Shen

    Abstract: Temporal Neural Networks (TNNs) are spiking neural networks that exhibit brain-like sensory processing with high energy efficiency. This work presents the ongoing research towards developing a custom design framework for designing efficient application-specific TNN-based Neuromorphic Sensory Processing Units (NSPUs). This paper examines previous works on NSPU designs for UCR time-series clustering…

    Submitted 27 May, 2022; originally announced May 2022.

  5. arXiv:2205.07410

    cs.AR cs.ET cs.LG cs.NE

    TNN7: A Custom Macro Suite for Implementing Highly Optimized Designs of Neuromorphic TNNs

    Authors: Harideep Nair, Prabhu Vellaisamy, Santha Bhasuthkar, John Paul Shen

    Abstract: Temporal Neural Networks (TNNs), inspired from the mammalian neocortex, exhibit energy-efficient online sensory processing capabilities. Recent works have proposed a microarchitecture framework for implementing TNNs and demonstrated competitive performance on vision and time-series applications. Building on these previous works, this work proposes TNN7, a suite of nine highly optimized custom macr…

    Submitted 25 May, 2022; v1 submitted 15 May, 2022; originally announced May 2022.

    Comments: To be published in ISVLSI 2022

  6. arXiv:2105.13262

    cs.AR cs.ET cs.LG cs.NE

    A Microarchitecture Implementation Framework for Online Learning with Temporal Neural Networks

    Authors: Harideep Nair, John Paul Shen, James E. Smith

    Abstract: Temporal Neural Networks (TNNs) are spiking neural networks that use time as a resource to represent and process information, similar to the mammalian neocortex. In contrast to compute-intensive deep neural networks that employ separate training and inference phases, TNNs are capable of extremely efficient online incremental/continual learning and are excellent candidates for building edge-native…

    Submitted 2 June, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

    Comments: To be published in ISVLSI 2021. arXiv admin note: substantial text overlap with arXiv:2009.00457

    Journal ref: 2021 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2021, pp. 266-271

  7. Unsupervised Clustering of Time Series Signals using Neuromorphic Energy-Efficient Temporal Neural Networks

    Authors: Shreyas Chaudhari, Harideep Nair, José M. F. Moura, John Paul Shen

    Abstract: Unsupervised time series clustering is a challenging problem with diverse industrial applications such as anomaly detection, bio-wearables, etc. These applications typically involve small, low-power devices on the edge that collect and process real-time sensory signals. State-of-the-art time-series clustering methods perform some form of loss minimization that is extremely computationally intensiv…

    Submitted 18 February, 2021; originally announced February 2021.

    Comments: Accepted for publication at ICASSP 2021

  8. arXiv:2012.05419   

    cs.AR cs.ET cs.LG cs.NE

    A Custom 7nm CMOS Standard Cell Library for Implementing TNN-based Neuromorphic Processors

    Authors: Harideep Nair, Prabhu Vellaisamy, Santha Bhasuthkar, John Paul Shen

    Abstract: A set of highly-optimized custom macro extensions is developed for a 7nm CMOS cell library for implementing Temporal Neural Networks (TNNs) that can mimic brain-like sensory processing with extreme energy efficiency. A TNN prototype (13,750 neurons and 315,000 synapses) for MNIST requires only 1.56 mm² die area and consumes only 1.69 mW.

    Submitted 4 June, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

    Comments: This work is dated and will be superseded by a forthcoming work

  9. arXiv:2009.00457

    cs.AR cs.ET cs.LG cs.NE

    Direct CMOS Implementation of Neuromorphic Temporal Neural Networks for Sensory Processing

    Authors: Harideep Nair, John Paul Shen, James E. Smith

    Abstract: Temporal Neural Networks (TNNs) use time as a resource to represent and process information, mimicking the behavior of the mammalian neocortex. This work focuses on implementing TNNs using off-the-shelf digital CMOS technology. A microarchitecture framework is introduced with a hierarchy of building blocks including: multi-neuron columns, multi-column layers, and multi-layer TNNs. We present the d…

    Submitted 27 August, 2020; originally announced September 2020.

    Comments: Submission Under Review for an IEEE Conference

  10. arXiv:1712.01235

    cs.AI cs.LG

    On the Real-time Vehicle Placement Problem

    Authors: Abhinav Jauhri, Carlee Joe-Wong, John Paul Shen

    Abstract: Motivated by ride-sharing platforms' efforts to reduce their riders' wait times for a vehicle, this paper introduces a novel problem of placing vehicles to fulfill real-time pickup requests in a spatially and temporally changing environment. The real-time nature of this problem makes it fundamentally different from other placement and scheduling problems, as it requires not only real-time placemen…

    Submitted 4 December, 2017; originally announced December 2017.

    Comments: Presented at NIPS Workshop on Machine Learning for Intelligent Transportation Systems, 2017

  11. arXiv:1701.06635

    cs.AI

    Space-Time Graph Modeling of Ride Requests Based on Real-World Data

    Authors: Abhinav Jauhri, Brian Foo, Jerome Berclaz, Chih Chi Hu, Radek Grzeszczuk, Vasu Parameswaran, John Paul Shen

    Abstract: This paper focuses on modeling ride requests and their variations over location and time, based on analyzing extensive real-world data from a ride-sharing service. We introduce a graph model that captures the spatial and temporal variability of ride requests and the potentials for ride pooling. We discover these ride request graphs exhibit a well known property called densification power law often…

    Submitted 23 January, 2017; originally announced January 2017.

    Comments: Accepted at AAAI-17 Workshop on AI and OR for Social Good (AIORSocGood-17)

    ACM Class: J.4; I.2.6; K.4.1