Skip to main content

Showing 1–50 of 68 results for author: Reddi, V J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.02877  [pdf, other

    cs.LG cs.DC

    FedStaleWeight: Buffered Asynchronous Federated Learning with Fair Aggregation via Staleness Reweighting

    Authors: Jeffrey Ma, Alan Tu, Yiling Chen, Vijay Janapa Reddi

    Abstract: Federated Learning (FL) endeavors to harness decentralized data while preserving privacy, facing challenges of performance, scalability, and collaboration. Asynchronous Federated Learning (AFL) methods have emerged as promising alternatives to their synchronous counterparts bounded by the slowest agent, yet they add additional challenges in convergence guarantees, fairness with respect to compute… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2405.00892  [pdf, other

    cs.CV cs.AI

    Wake Vision: A Large-scale, Diverse Dataset and Benchmark Suite for TinyML Person Detection

    Authors: Colby Banbury, Emil Njor, Matthew Stewart, Pete Warden, Manjunath Kudlur, Nat Jeffries, Xenofon Fafoutis, Vijay Janapa Reddi

    Abstract: Tiny machine learning (TinyML), which enables machine learning applications on extremely low-power devices, suffers from limited size and quality of relevant datasets. To address this issue, we introduce Wake Vision, a large-scale, diverse dataset tailored for person detection, the canonical task for TinyML visual sensing. Wake Vision comprises over 6 million images, representing a hundredfold inc… ▽ More

    Submitted 6 June, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  3. arXiv:2403.12075  [pdf, other

    cs.CY cs.AI cs.CR cs.CV cs.LG

    Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation

    Authors: Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, Minsuk Kahng, Erin van Liemt, Max Bartolo, Jess Tsang, Justin White, Nathan Clement, Rafael Mosquera, Juan Ciro, Vijay Janapa Reddi, Lora Aroyo

    Abstract: With the rise of text-to-image (T2I) generative AI models reaching wide audiences, it is critical to evaluate model robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on ``implicitly adversarial'' prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativit… ▽ More

    Submitted 13 May, 2024; v1 submitted 14 February, 2024; originally announced March 2024.

    Comments: 10 pages, 6 figures

  4. arXiv:2402.11183  [pdf, other

    cs.CY cs.HC

    Materiality and Risk in the Age of Pervasive AI Sensors

    Authors: Matthew Stewart, Emanuel Moss, Pete Warden, Brian Plancher, Susan Kennedy, Mona Sloane, Vijay Janapa Reddi

    Abstract: Artificial intelligence systems connected to sensor-laden devices are becoming pervasive, which has significant implications for a range of AI risks, including to privacy, the environment, autonomy, and more. There is therefore a growing need for increased accountability around the responsible development and deployment of these technologies. In this paper, we provide a comprehensive analysis of t… ▽ More

    Submitted 16 February, 2024; originally announced February 2024.

  5. arXiv:2311.13028  [pdf, other

    cs.LG cs.AI cs.DC eess.SP

    DMLR: Data-centric Machine Learning Research -- Past, Present and Future

    Authors: Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš , et al. (13 additional authors not shown)

    Abstract: Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods tow… ▽ More

    Submitted 1 June, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Published in the Journal of Data-centric Machine Learning Research (DMLR) at https://data.mlr.press/assets/pdf/v01-5.pdf

  6. arXiv:2310.07854  [pdf, other

    cs.RO

    VaPr: Variable-Precision Tensors to Accelerate Robot Motion Planning

    Authors: Yu-Shun Hsiao, Siva Kumar Sastry Hari, Balakumar Sundaralingam, Jason Yik, Thierry Tambe, Charbel Sakr, Stephen W. Keckler, Vijay Janapa Reddi

    Abstract: High-dimensional motion generation requires numerical precision for smooth, collision-free solutions. Typically, double-precision or single-precision floating-point (FP) formats are utilized. Using these for big tensors imposes a strain on the memory bandwidth provided by the devices and alters the memory footprint, hence limiting their applicability to low-power edge devices needed for mobile rob… ▽ More

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 7 pages, 5 figures, 8 tables, to be published in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  7. arXiv:2309.09212  [pdf, other

    cs.RO

    RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance

    Authors: Víctor Mayoral-Vilches, Jason Jabbour, Yu-Shun Hsiao, Zishen Wan, Martiño Crespo-Álvarez, Matthew Stewart, Juan Manuel Reina-Muñoz, Prateek Nagras, Gaurav Vikhe, Mohammad Bakhshalipour, Martin Pinzger, Stefan Rass, Smruti Panigrahi, Giulio Corradi, Niladri Roy, Phillip B. Gibbons, Sabrina M. Neuman, Brian Plancher, Vijay Janapa Reddi

    Abstract: We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and re… ▽ More

    Submitted 29 January, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

  8. arXiv:2307.10041  [pdf, other

    cs.RO cs.AR

    BERRY: Bit Error Robustness for Energy-Efficient Reinforcement Learning-Based Autonomous Systems

    Authors: Zishen Wan, Nandhini Chandramoorthy, Karthik Swaminathan, Pin-Yu Chen, Vijay Janapa Reddi, Arijit Raychowdhury

    Abstract: Autonomous systems, such as Unmanned Aerial Vehicles (UAVs), are expected to run complex reinforcement learning (RL) models to execute fully autonomous position-navigation-time tasks within stringent onboard weight and power constraints. We observe that reducing onboard operating voltage can benefit the energy efficiency of both the computation and flight mission, however, it can also result in on… ▽ More

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted in 2023 60th IEEE/ACM Design Automation Conference (DAC)

  9. arXiv:2306.09481  [pdf, other

    cs.AR cs.ET cs.LG cs.NE

    Leveraging Residue Number System for Designing High-Precision Analog Deep Neural Network Accelerators

    Authors: Cansu Demirkiran, Rashmi Agrawal, Vijay Janapa Reddi, Darius Bunandar, Ajay Joshi

    Abstract: Achieving high accuracy, while maintaining good energy efficiency, in analog DNN accelerators is challenging as high-precision data converters are expensive. In this paper, we overcome this challenge by using the residue number system (RNS) to compose high-precision operations from multiple low-precision operations. This enables us to eliminate the information loss caused by the limited precision… ▽ More

    Submitted 15 June, 2023; originally announced June 2023.

  10. arXiv:2306.08888  [pdf, other

    cs.AR cs.LG

    ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design

    Authors: Srivatsan Krishnan, Amir Yazdanbaksh, Shvetank Prakash, Jason Jabbour, Ikechukwu Uchendu, Susobhan Ghosh, Behzad Boroujerdian, Daniel Richins, Devashree Tripathy, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: Machine learning is a prevalent approach to tame the complexity of design space exploration for domain-specific architectures. Using ML for design space exploration poses challenges. First, it's not straightforward to identify the suitable algorithm from an increasing pool of ML methods. Second, assessing the trade-offs between performance and sample efficiency across these methods is inconclusive… ▽ More

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: International Symposium on Computer Architecture (ISCA 2023)

  11. arXiv:2306.08848  [pdf, other

    cs.LG cs.CY cs.HC

    Datasheets for Machine Learning Sensors: Towards Transparency, Auditability, and Responsibility for Intelligent Sensing

    Authors: Matthew Stewart, Pete Warden, Yasmine Omri, Shvetank Prakash, Joao Santos, Shawn Hymel, Benjamin Brown, Jim MacArthur, Nat Jeffries, Sachin Katti, Brian Plancher, Vijay Janapa Reddi

    Abstract: Machine learning (ML) sensors are enabling intelligence at the edge by empowering end-users with greater control over their data. ML sensors offer a new paradigm for sensing that moves the processing and analysis to the device itself rather than relying on the cloud, bringing benefits like lower latency and greater data privacy. The rise of these intelligent edge devices, while revolutionizing are… ▽ More

    Submitted 16 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  12. arXiv:2305.14384  [pdf, other

    cs.LG cs.AI cs.CR cs.CV

    Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models

    Authors: Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo

    Abstract: The generative AI revolution in recent years has been spurred by an expansion in compute power and data quantity, which together enable extensive pre-training of powerful text-to-image (T2I) models. With their greater capabilities to generate realistic and creative content, these T2I models like DALL-E, MidJourney, Imagen or Stable Diffusion are reaching ever wider audiences. Any unsafe behaviors… ▽ More

    Submitted 22 May, 2023; originally announced May 2023.

    MSC Class: 14J68 (Primary)

  13. arXiv:2304.04640  [pdf, other

    cs.AI

    NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems

    Authors: Jason Yik, Korneel Van den Berghe, Douwe den Blanken, Younes Bouhadjar, Maxime Fabre, Paul Hueber, Denis Kleyko, Noah Pacik-Nelson, Pao-Sheng Vincent Sun, Guangzhi Tang, Shenqi Wang, Biyan Zhou, Soikat Hasan Ahmed, George Vathakkattil Joseph, Benedetto Leto, Aurora Micheli, Anurag Kumar Mishra, Gregor Lenz, Tao Sun, Zergham Ahmed, Mahmoud Akl, Brian Anderson, Andreas G. Andreou, Chiara Bartolozzi, Arindam Basu , et al. (73 additional authors not shown)

    Abstract: Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neu… ▽ More

    Submitted 17 January, 2024; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: Updated from whitepaper to full perspective article preprint

  14. arXiv:2301.11899  [pdf

    cs.LG cs.AR cs.CY

    Is TinyML Sustainable? Assessing the Environmental Impacts of Machine Learning on Microcontrollers

    Authors: Shvetank Prakash, Matthew Stewart, Colby Banbury, Mark Mazumder, Pete Warden, Brian Plancher, Vijay Janapa Reddi

    Abstract: The sustained growth of carbon emissions and global waste elicits significant sustainability concerns for our environment's future. The growing Internet of Things (IoT) has the potential to exacerbate this issue. However, an emerging area known as Tiny Machine Learning (TinyML) has the opportunity to help address these environmental challenges through sustainable computing practices. TinyML, the d… ▽ More

    Submitted 21 November, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Communications of the ACM (CACM) November 2023 Issue

  15. arXiv:2301.10904  [pdf, other

    cs.CR cs.DC cs.LG

    GPU-based Private Information Retrieval for On-Device Machine Learning Inference

    Authors: Maximilian Lam, Jeff Johnson, Wenjie Xiong, Kiwan Maeng, Udit Gupta, Yang Li, Liangzhen Lai, Ilias Leontiadis, Minsoo Rhu, Hsien-Hsin S. Lee, Vijay Janapa Reddi, Gu-Yeon Wei, David Brooks, G. Edward Suh

    Abstract: On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the or… ▽ More

    Submitted 25 September, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  16. arXiv:2212.03332  [pdf, other

    cs.DC cs.LG cs.SE

    Edge Impulse: An MLOps Platform for Tiny Machine Learning

    Authors: Shawn Hymel, Colby Banbury, Daniel Situnayake, Alex Elium, Carl Ward, Mat Kelcey, Mathijs Baaijens, Mateusz Majchrzycki, Jenny Plunkett, David Tischler, Alessandro Grande, Louis Moreau, Dmitry Maslov, Artie Beavis, Jan Jongboom, Vijay Janapa Reddi

    Abstract: Edge Impulse is a cloud-based machine learning operations (MLOps) platform for develo** embedded and edge ML (TinyML) systems that can be deployed to a wide range of hardware targets. Current TinyML workflows are plagued by fragmented software stacks and heterogeneous deployment hardware, making ML model optimizations difficult and unportable. We present Edge Impulse, a practical MLOps platform… ▽ More

    Submitted 28 April, 2023; v1 submitted 2 November, 2022; originally announced December 2022.

  17. arXiv:2211.16385  [pdf, other

    cs.AR cs.AI cs.LG cs.MA

    Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration

    Authors: Srivatsan Krishnan, Natasha Jaques, Shayegan Omidshafiei, Dan Zhang, Izzeddin Gur, Vijay Janapa Reddi, Aleksandra Faust

    Abstract: Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency. As the systems grow in complexity, fine-tuning architectural parameters across multiple sub-systems (e.g., datapath, memory blocks in different hierarchies, interconnects, compiler optimization, etc.) quickly results in a combinatorial explosion of design s… ▽ More

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: Workshop on ML for Systems at NeurIPS 2022

  18. arXiv:2211.08675  [pdf, other

    cs.LG cs.ET

    XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse

    Authors: Hyoukjun Kwon, Krishnakumar Nair, Jamin Seo, Jason Yik, Debabrata Mohapatra, Dongyuan Zhan, **ook Song, Peter Capak, Peizhao Zhang, Peter Vajda, Colby Banbury, Mark Mazumder, Liangzhen Lai, Ashish Sirasao, Tushar Krishna, Harshit Khaitan, Vikas Chandra, Vijay Janapa Reddi

    Abstract: Real-time multi-task multi-model (MTMM) workloads, a new form of deep learning inference workloads, are emerging for applications areas like extended reality (XR) to support metaverse use cases. These workloads combine user interactivity with computationally complex machine learning (ML) activities. Compared to standard ML applications, these ML workloads present unique difficulties and constraint… ▽ More

    Submitted 19 May, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

  19. arXiv:2207.10062  [pdf, other

    cs.LG

    DataPerf: Benchmarks for Data-Centric AI Development

    Authors: Mark Mazumder, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, Lynn He, Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Douwe Kiela, David Jurado, David Kanter, Rafael Mosquera, Juan Ciro, Lora Aroyo, Bilge Acun, Lingjiao Chen, Mehul Smriti Raje, Max Bartolo, Sabri Eyuboglu, Amirata Ghorbani, Emmett Goodman , et al. (20 additional authors not shown)

    Abstract: Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing datase… ▽ More

    Submitted 13 October, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  20. arXiv:2207.07958  [pdf, other

    cs.LG physics.comp-ph physics.ins-det

    FastML Science Benchmarks: Accelerating Real-Time Scientific Edge Machine Learning

    Authors: Javier Duarte, Nhan Tran, Ben Hawks, Christian Herwig, Jules Muhizi, Shvetank Prakash, Vijay Janapa Reddi

    Abstract: Applications of machine learning (ML) are growing by the day for many unique and challenging scientific applications. However, a crucial challenge facing these applications is their need for ultra low-latency and on-detector ML capabilities. Given the slowdown in Moore's law and Dennard scaling, coupled with the rapid advances in scientific instrumentation that is resulting in growing data rates,… ▽ More

    Submitted 16 July, 2022; originally announced July 2022.

    Comments: 9 pages, 4 figures, Contribution to 3rd Workshop on Benchmarking Machine Learning Workloads on Emerging Hardware (MLBench) at 5th Conference on Machine Learning and Systems (MLSys)

    Report number: FERMILAB-CONF-22-534-PPD-SCD

  21. arXiv:2206.03266  [pdf, other

    cs.LG cs.AR eess.SP

    Machine Learning Sensors

    Authors: Pete Warden, Matthew Stewart, Brian Plancher, Colby Banbury, Shvetank Prakash, Emma Chen, Zain Asgar, Sachin Katti, Vijay Janapa Reddi

    Abstract: Machine learning sensors represent a paradigm shift for the future of embedded machine learning applications. Current instantiations of embedded machine learning (ML) suffer from complex integration, lack of modularity, and privacy and security concerns from data movement. This article proposes a more data-centric paradigm for embedding sensor intelligence on edge devices to combat these challenge… ▽ More

    Submitted 7 June, 2022; originally announced June 2022.

  22. arXiv:2205.07149  [pdf, other

    cs.RO

    Robotic Computing on FPGAs: Current Progress, Research Challenges, and Opportunities

    Authors: Zishen Wan, Ashwin Lele, Bo Yu, Shaoshan Liu, Yu Wang, Vijay Janapa Reddi, Cong Hao, Arijit Raychowdhury

    Abstract: Robotic computing has reached a tip** point, with a myriad of robots (e.g., drones, self-driving cars, logistic robots) being widely applied in diverse scenarios. The continuous proliferation of robotics, however, critically depends on efficient computing substrates, driven by real-time requirements, robotic size-weight-and-power constraints, cybersecurity considerations, and dynamically changin… ▽ More

    Submitted 14 May, 2022; originally announced May 2022.

    Comments: 2022 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), June 13-15, 2022, Incheon, Korea

  23. arXiv:2205.05748  [pdf, other

    cs.LG cs.RO

    Tiny Robot Learning: Challenges and Directions for Machine Learning in Resource-Constrained Robots

    Authors: Sabrina M. Neuman, Brian Plancher, Bardienus P. Duisterhof, Srivatsan Krishnan, Colby Banbury, Mark Mazumder, Shvetank Prakash, Jason Jabbour, Aleksandra Faust, Guido C. H. E. de Croon, Vijay Janapa Reddi

    Abstract: Machine learning (ML) has become a pervasive tool across computing systems. An emerging application that stress-tests the challenges of ML system design is tiny robot learning, the deployment of ML on resource-constrained low-cost autonomous robots. Tiny robot learning lies at the intersection of embedded systems, robotics, and ML, compounding the challenges of these domains. Tiny robot learning i… ▽ More

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: 4 pages, 3 figures, 1 table, in IEEE AICAS 2022

  24. arXiv:2205.03929  [pdf, other

    cs.RO

    RobotCore: An Open Architecture for Hardware Acceleration in ROS 2

    Authors: Víctor Mayoral-Vilches, Sabrina M. Neuman, Brian Plancher, Vijay Janapa Reddi

    Abstract: Hardware acceleration can revolutionize robotics, enabling new applications by speeding up robot response times while remaining power-efficient. However, the diversity of acceleration options makes it difficult for roboticists to easily deploy accelerated systems without expertise in each specific hardware platform. In this work, we address this challenge with RobotCore, an architecture to integra… ▽ More

    Submitted 30 June, 2023; v1 submitted 8 May, 2022; originally announced May 2022.

  25. arXiv:2205.03347  [pdf, other

    cs.AI cs.RO

    Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles

    Authors: Yu-Shun Hsiao, Siva Kumar Sastry Hari, Michał Filipiuk, Timothy Tsai, Michael B. Sullivan, Vijay Janapa Reddi, Vasu Singh, Stephen W. Keckler

    Abstract: The processing requirement of autonomous vehicles (AVs) for high-accuracy perception in complex scenarios can exceed the resources offered by the in-vehicle computer, degrading safety and comfort. This paper proposes a sensor frame processing rate (FPR) estimation model, Zhuyi, that quantifies the minimum safe FPR continuously in a driving scenario. Zhuyi can be employed post-deployment as an onli… ▽ More

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: 2022 Design Automation Conference (DAC), July 10-14, 2022, San Francisco

  26. arXiv:2205.03325  [pdf, other

    cs.AR cs.RO

    OMU: A Probabilistic 3D Occupancy Map** Accelerator for Real-time OctoMap at the Edge

    Authors: Tianyu Jia, En-Yu Yang, Yu-Shun Hsiao, Jonathan Cruz, David Brooks, Gu-Yeon Wei, Vijay Janapa Reddi

    Abstract: Autonomous machines (e.g., vehicles, mobile robots, drones) require sophisticated 3D map** to perceive the dynamic environment. However, maintaining a real-time 3D map is expensive both in terms of compute and memory requirements, especially for resource-constrained edge machines. Probabilistic OctoMap is a reliable and memory-efficient 3D dense map model to represent the full environment, with… ▽ More

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: 2022 Design Automation and Test in Europe Conference (DATE), March 14-23, 2022, Virtual

  27. arXiv:2204.10898  [pdf, other

    cs.RO cs.AR

    Roofline Model for UAVs: A Bottleneck Analysis Tool for Onboard Compute Characterization of Autonomous Unmanned Aerial Vehicles

    Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Ninad Jadhav, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: We introduce an early-phase bottleneck analysis and characterization model called the F-1 for designing computing systems that target autonomous Unmanned Aerial Vehicles (UAVs). The model provides insights by exploiting the fundamental relationships between various components in the autonomous UAV, such as sensor, compute, and body dynamics. To guarantee safe operation while maximizing the perform… ▽ More

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: To Appear in 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). arXiv admin note: substantial text overlap with arXiv:2111.03792

  28. arXiv:2203.07276  [pdf, other

    cs.LG cs.AR

    FRL-FI: Transient Fault Analysis for Federated Reinforcement Learning-Based Navigation Systems

    Authors: Zishen Wan, Aqeel Anwar, Abdulrahman Mahmoud, Tianyu Jia, Yu-Shun Hsiao, Vijay Janapa Reddi, Arijit Raychowdhury

    Abstract: Swarm intelligence is being increasingly deployed in autonomous systems, such as drones and unmanned vehicles. Federated reinforcement learning (FRL), a key swarm intelligence paradigm where agents interact with their own environments and cooperatively learn a consensus policy while preserving privacy, has recently shown potential advantages and gained popularity. However, transient faults are inc… ▽ More

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 2022 Design Automation and Test in Europe Conference (DATE), March 14-23, 2022, Virtual

  29. arXiv:2203.02833  [pdf, other

    cs.CR cs.AI

    Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference

    Authors: Maximilian Lam, Michael Mitzenmacher, Vijay Janapa Reddi, Gu-Yeon Wei, David Brooks

    Abstract: Multiparty computation approaches to secure neural network inference commonly rely on garbled circuits for securely executing nonlinear activation functions. However, garbled circuits require excessive communication between server and client, impose significant storage overheads, and incur large runtime penalties. To reduce these costs, we propose an alternative to garbled circuits: Tabula, an alg… ▽ More

    Submitted 16 June, 2024; v1 submitted 5 March, 2022; originally announced March 2022.

  30. arXiv:2201.05232  [pdf, other

    cs.AR

    FARSI: Facebook AR System Investigator for Agile Domain-Specific System-on-Chip Exploration

    Authors: Behzad Boroujerdian, Ying **g, Amit Kumar, Lavanya Subramanian, Luke Yen, Vincent Lee, Vivek Venkatesan, Amit **dal, Robert Shearer, Vijay Janapa Reddi

    Abstract: Domain-specific SoCs (DSSoCs) are attractive solutions for domains with stringent power/performance/area constraints; however, they suffer from two fundamental complexities. On the one hand, their many specialized hardware blocks result in complex systems and thus high development effort. On the other, their many system knobs expand the complexity of design space, making the search for the optimal… ▽ More

    Submitted 17 January, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

  31. CFU Playground: Full-Stack Open-Source Framework for Tiny Machine Learning (tinyML) Acceleration on FPGAs

    Authors: Shvetank Prakash, Tim Callahan, Joseph Bushagour, Colby Banbury, Alan V. Green, Pete Warden, Tim Ansell, Vijay Janapa Reddi

    Abstract: Need for the efficient processing of neural networks has given rise to the development of hardware accelerators. The increased adoption of specialized hardware has highlighted the need for more agile design flows for hardware-software co-design and domain-specific optimizations. In this paper, we present CFU Playground: a full-stack open-source framework that enables rapid and iterative design and… ▽ More

    Submitted 5 April, 2023; v1 submitted 5 January, 2022; originally announced January 2022.

    Journal ref: IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). (2023) 157-167

  32. arXiv:2111.09344  [pdf, other

    cs.LG stat.ML

    The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

    Authors: Daniel Galvez, Greg Diamos, Juan Ciro, Juan Felipe Cerón, Keith Achorn, Anjali Gopi, David Kanter, Maximilian Lam, Mark Mazumder, Vijay Janapa Reddi

    Abstract: The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset). The data is collected via searching the Internet for appropriately licensed audio data with existing transcriptions. We describe our data collection methodology and release our data collection… ▽ More

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: Part of 2021 Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks

  33. arXiv:2111.04957  [pdf, other

    cs.RO

    Analyzing and Improving Fault Tolerance of Learning-Based Navigation Systems

    Authors: Zishen Wan, Aqeel Anwar, Yu-Shun Hsiao, Tianyu Jia, Vijay Janapa Reddi, Arijit Raychowdhury

    Abstract: Learning-based navigation systems are widely used in autonomous applications, such as robotics, unmanned vehicles and drones. Specialized hardware accelerators have been proposed for high-performance and energy-efficiency for such navigational tasks. However, transient and permanent faults are increasing in hardware systems and can catastrophically violate tasks safety. Meanwhile, traditional redu… ▽ More

    Submitted 9 November, 2021; originally announced November 2021.

    Comments: Accepted in 58th ACM/IEEE Design Automation Conference (DAC), 2021

  34. arXiv:2111.03792   

    cs.RO

    Roofline Model for UAVs:A Bottleneck Analysis Tool for Designing Compute Systems for Autonomous Drones

    Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Aleksandra Faust, Vijay Janapa Reddi

    Abstract: We present a bottleneck analysis tool for designing compute systems for autonomous Unmanned Aerial Vehicles (UAV). The tool provides insights by exploiting the fundamental relationships between various components in the autonomous UAV such as sensor, compute, body dynamics. To guarantee safe operation while maximizing the performance (e.g., velocity) of the UAV, the compute, sensor, and other mech… ▽ More

    Submitted 15 June, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: The latest and updated version with conference is available here: arXiv:2204.10898

  35. arXiv:2110.01406  [pdf

    cs.LG cs.DC cs.PF cs.SE

    MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

    Authors: Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan , et al. (17 additional authors not shown)

    Abstract: Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf,… ▽ More

    Submitted 28 December, 2021; v1 submitted 29 September, 2021; originally announced October 2021.

  36. GRiD: GPU-Accelerated Rigid Body Dynamics with Analytical Gradients

    Authors: Brian Plancher, Sabrina M. Neuman, Radhika Ghosal, Scott Kuindersma, Vijay Janapa Reddi

    Abstract: We introduce GRiD: a GPU-accelerated library for computing rigid body dynamics with analytical gradients. GRiD was designed to accelerate the nonlinear trajectory optimization subproblem used in state-of-the-art robotic planning, control, and machine learning, which requires tens to hundreds of naturally parallel computations of rigid body dynamics and their gradients at each iteration. GRiD lever… ▽ More

    Submitted 25 February, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: Camera Ready Update: 8 pages, 5 figures, 1 data table, 2 algorithm blocks

  37. arXiv:2109.05683  [pdf, ps, other

    cs.RO cs.AR

    AutoSoC: Automating Algorithm-SOC Co-design for Aerial Robots

    Authors: Srivatsan Krishnan, Thierry Tambe, Zishen Wan, Vijay Janapa Reddi

    Abstract: Aerial autonomous machines (Drones) has a plethora of promising applications and use cases. While the popularity of these autonomous machines continues to grow, there are many challenges, such as endurance and agility, that could hinder the practical deployment of these machines. The closed-loop control frequency must be high to achieve high agility. However, given the resource-constrained nature… ▽ More

    Submitted 12 September, 2021; originally announced September 2021.

    Comments: Class Project for CS249r: Special Topics on Edge Computing (Autonomous Machines)

  38. arXiv:2109.01126  [pdf, other

    cs.AR cs.ET

    An Electro-Photonic System for Accelerating Deep Neural Networks

    Authors: Cansu Demirkiran, Furkan Eris, Gongyu Wang, Jonathan Elmhurst, Nick Moore, Nicholas C. Harris, Ayon Basumallik, Vijay Janapa Reddi, Ajay Joshi, Darius Bunandar

    Abstract: The number of parameters in deep neural networks (DNNs) is scaling at about 5$\times$ the rate of Moore's Law. To sustain this growth, photonic computing is a promising avenue, as it enables higher throughput in dominant general matrix-matrix multiplication (GEMM) operations in DNNs than their electrical counterpart. However, purely photonic systems face several challenges including lack of photon… ▽ More

    Submitted 16 December, 2022; v1 submitted 2 September, 2021; originally announced September 2021.

    Journal ref: J. Emerg. Technol. Comput. Syst. 19, 4, Article 30 (October 2023)

  39. arXiv:2108.13354  [pdf, other

    cs.RO

    RoboRun: A Robot Runtime to Exploit Spatial Heterogeneity

    Authors: Behzad Boroujerdian, Radhika Ghosal, Jonathan Cruz, Brian Plancher, Vijay Janapa Reddi

    Abstract: The limited onboard energy of autonomous mobile robots poses a tremendous challenge for practical deployment. Hence, efficient computing solutions are imperative. A crucial shortcoming of state-of-the-art computing solutions is that they ignore the robot's operating environment heterogeneity and make static, worst-case assumptions. As this heterogeneity impacts the system's computing payload, an o… ▽ More

    Submitted 30 August, 2021; originally announced August 2021.

    Comments: will be published in Design Automation Conference (DAC) 2021

  40. arXiv:2107.05490  [pdf, other

    cs.RO

    Sniffy Bug: A Fully Autonomous Swarm of Gas-Seeking Nano Quadcopters in Cluttered Environments

    Authors: Bardienus P. Duisterhof, Shushuai Li, Javier Burgués, Vijay Janapa Reddi, Guido C. H. E. de Croon

    Abstract: Nano quadcopters are ideal for gas source localization (GSL) as they are safe, agile and inexpensive. However, their extremely restricted sensors and computational resources make GSL a daunting challenge. In this work, we propose a novel bug algorithm named `Sniffy Bug', which allows a fully autonomous swarm of gas-seeking nano quadcopters to localize a gas source in an unknown, cluttered and GPS-… ▽ More

    Submitted 12 July, 2021; originally announced July 2021.

  41. arXiv:2106.07597  [pdf, other

    cs.LG cs.AR

    MLPerf Tiny Benchmark

    Authors: Colby Banbury, Vijay Janapa Reddi, Peter Torelli, Jeremy Holleman, Nat Jeffries, Csaba Kiraly, Pietro Montino, David Kanter, Sebastian Ahmed, Danilo Pau, Urmish Thakker, Antonio Torrini, Peter Warden, Jay Cordaro, Giuseppe Di Guglielmo, Javier Duarte, Stephen Gibellini, Videet Parekh, Honson Tran, Nhan Tran, Niu Wenxu, Xu Xuesong

    Abstract: Advancements in ultra-low-power tiny machine learning (TinyML) systems promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted and easily reproducible benchmark for these systems. To meet this need, we present MLPerf Tiny, the first industry-standard benchmark suite for ultra-low-power tiny machine learning systems. The… ▽ More

    Submitted 24 August, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: TinyML Benchmark

  42. arXiv:2106.06089  [pdf, other

    cs.CR cs.AI

    Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix

    Authors: Maximilian Lam, Gu-Yeon Wei, David Brooks, Vijay Janapa Reddi, Michael Mitzenmacher

    Abstract: We show that aggregated model updates in federated learning may be insecure. An untrusted central server may disaggregate user updates from sums of updates across participants given repeated observations, enabling the server to recover privileged information about individual users' private training data via traditional gradient inference attacks. Our method revolves around reconstructing participa… ▽ More

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  43. arXiv:2106.04008  [pdf, other

    cs.LG

    Widening Access to Applied Machine Learning with TinyML

    Authors: Vijay Janapa Reddi, Brian Plancher, Susan Kennedy, Laurence Moroney, Pete Warden, Anant Agarwal, Colby Banbury, Massimo Banzi, Matthew Bennett, Benjamin Brown, Sharad Chitlangia, Radhika Ghosal, Sarah Grafman, Rupert Jaeger, Srivatsan Krishnan, Maximilian Lam, Daniel Leiker, Cara Mann, Mark Mazumder, Dominic Pajak, Dhilan Ramaprasad, J. Evan Smith, Matthew Stewart, Dustin Tingley

    Abstract: Broadening access to both computational and educational resources is critical to diffusing machine-learning (ML) innovation. However, today, most ML resources and experts are siloed in a few countries and organizations. In this paper, we describe our pedagogical approach to increasing access to applied ML through a massive open online course (MOOC) on Tiny Machine Learning (TinyML). We suggest tha… ▽ More

    Submitted 9 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Understanding the underpinnings of the TinyML edX course series: https://www.edx.org/professional-certificate/harvardx-tiny-machine-learning

  44. arXiv:2105.12882  [pdf, other

    cs.RO

    MAVFI: An End-to-End Fault Analysis Framework with Anomaly Detection and Recovery for Micro Aerial Vehicles

    Authors: Yu-Shun Hsiao, Zishen Wan, Tianyu Jia, Radhika Ghosal, Abdulrahman Mahmoud, Arijit Raychowdhury, David Brooks, Gu-Yeon Wei, Vijay Janapa Reddi

    Abstract: Safety and resilience are critical for autonomous unmanned aerial vehicles (UAVs). We introduce MAVFI, the micro aerial vehicles (MAVs) resilience analysis methodology to assess the effect of silent data corruption (SDC) on UAVs' mission metrics, such as flight time and success rate, for accurately measuring system resilience. To enhance the safety and resilience of robot systems bound by size, we… ▽ More

    Submitted 30 January, 2023; v1 submitted 26 May, 2021; originally announced May 2021.

    Comments: 6 pages, 9 figures; The first two authors have equal contributions; Accepted as a conference paper in DATE 2023

  45. Few-Shot Keyword Spotting in Any Language

    Authors: Mark Mazumder, Colby Banbury, Josh Meyer, Pete Warden, Vijay Janapa Reddi

    Abstract: We introduce a few-shot transfer learning method for keyword spotting in any language. Leveraging open speech corpora in nine languages, we automate the extraction of a large multilingual keyword bank and use it to train an embedding model. With just five training examples, we fine-tune the embedding model for keyword spotting and achieve an average F1 score of 0.75 on keyword classification for 1… ▽ More

    Submitted 9 September, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Journal ref: Proc. Interspeech 2021

  46. arXiv:2102.11447  [pdf, other

    cs.LG

    Data Engineering for Everyone

    Authors: Vijay Janapa Reddi, Greg Diamos, Pete Warden, Peter Mattson, David Kanter

    Abstract: Data engineering is one of the fastest-growing fields within machine learning (ML). As ML becomes more common, the appetite for data grows more ravenous. But ML requires more data than individual teams of data engineers can readily produce, which presents a severe challenge to ML deployment at scale. Much like the software-engineering revolution, where mass adoption of open-source software replace… ▽ More

    Submitted 22 February, 2021; originally announced February 2021.

  47. arXiv:2102.04285  [pdf, other

    cs.LG cs.SE

    RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning Workloads

    Authors: James Gleeson, Srivatsan Krishnan, Moshe Gabel, Vijay Janapa Reddi, Eyal de Lara, Gennady Pekhimenko

    Abstract: Deep reinforcement learning (RL) has made groundbreaking advancements in robotics, data center management and other applications. Unfortunately, system-level bottlenecks in RL workloads are poorly understood; we observe fundamental structural differences in RL workloads that make them inherently less GPU-bound than supervised learning (SL). To explain where training time is spent in RL workloads,… ▽ More

    Submitted 4 March, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: RL-Scope is an open-source tool available at https://github.com/UofT-EcoSystem/rlscope . Proceedings of the 4th MLSys Conference, 2021. Changes: camera ready for MLSys publication -- shorten abstract, add acknowledgements, minor grammar fixes

  48. arXiv:2102.02988  [pdf, other

    cs.RO cs.AI cs.AR cs.LG

    AutoPilot: Automating SoC Design Space Exploration for SWaP Constrained Autonomous UAVs

    Authors: Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Paul Whatmough, Aleksandra Faust, Sabrina Neuman, Gu-Yeon Wei, David Brooks, Vijay Janapa Reddi

    Abstract: Building domain-specific accelerators for autonomous unmanned aerial vehicles (UAVs) is challenging due to a lack of systematic methodology for designing onboard compute. Balancing a computing system for a UAV requires considering both the cyber (e.g., sensor rate, compute performance) and physical (e.g., payload weight) characteristics that affect overall performance. Iterating over the many comp… ▽ More

    Submitted 10 September, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

  49. arXiv:2012.02328  [pdf, other

    cs.LG cs.DC

    MLPerf Mobile Inference Benchmark

    Authors: Vijay Janapa Reddi, David Kanter, Peter Mattson, Jared Duke, Thai Nguyen, Ramesh Chukka, Ken Shiring, Koan-Sin Tan, Mark Charlebois, William Chou, Mostafa El-Khamy, Jungwook Hong, Tom St. John, Cindy Trinh, Michael Buch, Mark Mazumder, Relia Markovic, Thomas Atta, Fatih Cakir, Masoud Charkhabi, Xiaodong Chen, Cheng-Ming Chiang, Dave Dexter, Terry Heo, Gunther Schmuelling , et al. (2 additional authors not shown)

    Abstract: This paper presents the first industry-standard open-source machine learning (ML) benchmark to allow perfor mance and accuracy evaluation of mobile devices with different AI chips and software stacks. The benchmark draws from the expertise of leading mobile-SoC vendors, ML-framework providers, and model producers. It comprises a suite of models that operate with standard data sets, quality metrics… ▽ More

    Submitted 6 April, 2022; v1 submitted 3 December, 2020; originally announced December 2020.

  50. arXiv:2010.11267  [pdf, other

    cs.LG

    MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers

    Authors: Colby Banbury, Chuteng Zhou, Igor Fedorov, Ramon Matas Navarro, Urmish Thakker, Dibakar Gope, Vijay Janapa Reddi, Matthew Mattina, Paul N. Whatmough

    Abstract: Executing machine learning workloads locally on resource constrained microcontrollers (MCUs) promises to drastically expand the application space of IoT. However, so-called TinyML presents severe technical challenges, as deep neural network inference demands a large compute and memory budget. To address this challenge, neural architecture search (NAS) promises to help design accurate ML models tha… ▽ More

    Submitted 12 April, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: 10 pages, 8 figures, 3 tables