Search | arXiv e-print repository

An Automated Validation Framework for Power Management and Data Retention Logic Kits of Standard Cell Library

Authors: Akshay Karkal Kamath, Bharath Kumar, Sunil Aggarwal, Subramanian Parameswaran, Parag Lonkar, Debi Prasanna, Somasunder Sreenath

Abstract: The development of a standard cell library involves characterization of a number of gate-level circuits at various cell-level abstractions. Verifying the behavior of these cells largely depends on the manual skills of the circuit designers. Especially challenging are the power management and data retention cells which must be checked thoroughly for voltage and power configurations in addition to their logic functionality. Also, when standard cells are extracted into various models, any inconsistencies in these models typically go unchecked during library development. Thus, validating these cells exhaustively prior to customer delivery is highly advantageous to not only improve customer satisfaction but also to reduce design costs. We address this challenge by presenting a methodology to validate the power management and data retention cells that are used in the logical design flow of low-power chips. For a quick adoption by standard cell library design teams, the framework is fully automated and runs out-of-the-box. The proposed framework has been implemented and deployed within the Samsung Foundry ecosystem to enhance the overall quality of library design kit deliverables.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 33rd Design and Verification Conference and Exhibition United States (DVCon U.S. 2021)

arXiv:2404.09356 [pdf, other]

LLeMpower: Understanding Disparities in the Control and Access of Large Language Models

Authors: Vishwas Sathish, Hannah Lin, Aditya K Kamath, Anish Nyayachavadi

Abstract: Large Language Models (LLMs) are a powerful technology that augments human skill to create new opportunities, akin to the development of steam engines and the internet. However, LLMs come with a high cost. They require significant computing resources and energy to train and serve. Inequity in their control and access has led to concentration of ownership and power to a small collection of corporations. In our study, we collect training and inference requirements for various LLMs. We then analyze the economic strengths of nations and organizations in the context of developing and serving these models. Additionally, we also look at whether individuals around the world can access and use this emerging technology. We compare and contrast these groups to show that these technologies are monopolized by surprisingly few entities. We conclude with a qualitative study on the ethical implications of our findings and discuss future directions towards equity in LLM access.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 11 total pages, 7 page text, 4 page references, 3 figures (with subfigures), 1 table

ACM Class: K.4.0; K.7.4

arXiv:2401.08908 [pdf, other]

Herding LLaMaS: Using LLMs as an OS Module

Authors: Aditya K Kamath, Sujay Yadalam

Abstract: Computer systems are becoming increasingly heterogeneous with the emergence of new memory technologies and compute devices. GPUs alongside CPUs have become commonplace and CXL is poised to be a mainstay of cloud systems. The operating system is responsible for managing these hardware resources, requiring modification every time a new device is released. Years of research and development are sunk into tuning the OS for high performance with each new heterogeneous device. With the recent explosion in memory technologies and domain-specific accelerators, it would be beneficial to have an OS that could provide high performance for new devices without significant effort. We propose LLaMaS which can adapt to new devices easily. LLaMaS uses Large Language Models (LLMs) to extract the useful features of new devices from their textual description and uses these features to make operating system decisions at runtime. Adding support to LLaMaS for a new device is as simple as describing the system and new device properties in plaintext. LLaMaS reduces the burden on system administrators to enable easy integration of new devices into production systems. Preliminary evaluation using ChatGPT shows that LLMs are capable of extracting device features from text and making correct OS decisions based on those features.

Submitted 16 January, 2024; originally announced January 2024.

Comments: ASPLOS 2023, Wild and Crazy Ideas session

arXiv:2312.00647 [pdf, other]

MaxMem: Colocation and Performance for Big Data Applications on Tiered Main Memory Servers

Authors: Amanda Raybuck, Wei Zhang, Kayvan Mansoorshahi, Aditya K. Kamath, Mattan Erez, Simon Peter

Abstract: We present MaxMem, a tiered main memory management system that aims to maximize Big Data application colocation and performance. MaxMem uses an application-agnostic and lightweight memory occupancy control mechanism based on fast memory miss ratios to provide application QoS under increasing colocation. By relying on memory access sampling and binning to quickly identify per-process memory heat gradients, MaxMem maximizes performance for many applications sharing tiered main memory simultaneously. MaxMem is designed as a user-space memory manager to be easily modifiable and extensible, without complex kernel code development. On a system with tiered main memory consisting of DRAM and Intel Optane persistent memory modules, our evaluation confirms that MaxMem provides 11% and 38% better throughput and up to 80% and an order of magnitude lower 99th percentile latency than HeMem and Linux AutoNUMA, respectively, with a Big Data key-value store in dynamic colocation scenarios.

Submitted 1 December, 2023; originally announced December 2023.

Comments: 12 pages, 10 figures

arXiv:2210.08578 [pdf, other]

doi 10.1145/3566097.3567864

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

Authors: Yimeng Zhang, Akshay Karkal Kamath, Qiucheng Wu, Zhiwen Fan, Wuyang Chen, Zhangyang Wang, Shiyu Chang, Sijia Liu, Cong Hao

Abstract: In this paper, we propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on High-Definition (HD) video stream. First, to enable ultra-light video intelligence, we propose temporal frame-filtering and spatial saliency-focusing approaches to reduce the complexity of massive video data. Second, we exploit structure-aware weight sparsity to design a hardware-friendly model compression method. Third, assisted with data and model complexity reduction, we propose a sparsity-aware, scalable, and low-power accelerator design, aiming to deliver real-time performance with high energy efficiency. Different from existing works, we make a solid step towards the synergized software/hardware co-optimization for realistic MOT model implementation. Compared to the state-of-the-art MOT baseline, our tri-design approach can achieve 12.5x latency reduction, 20.9x effective frame rate improvement, 5.83x lower power, and 9.78x better energy efficiency, without much accuracy drop.

Submitted 17 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

Comments: Accepted to ASP-DAC'23

arXiv:1909.12221 [pdf, other]

Storage Class Memory: Principles, Problems, and Possibilities

Authors: Aditya K Kamath, Leslie Monis, A Tarun Karthik, Basavaraj Talawar

Abstract: Storage Class Memory (SCM) is a class of memory technology which has recently become viable for use. The name arises from the fact that these memories exhibit non-volatility of data, similar to secondary storage, while also having latencies comparable to primary memory and byte-addressability. In this area, Phase Change Memory (PCM), Spin-Transfer-Torque Random Access Memory (STT-RAM), and Resistive RAM (ReRAM) have emerged as the major contenders for commercial and industrial use. In this paper, we describe how these memory types function, while highlighting the problems of endurance and performance that these memory types face. We also discuss the future possibilities of Multi-Level Cells (MLCs), as well as how SCM can be used to construct accelerators.

Submitted 26 September, 2019; originally announced September 2019.

arXiv:1909.03543 [pdf, other]

An Experimental Study of Structural Diversity in Social Networks

Authors: Jessica Su, Krishna Kamath, Aneesh Sharma, Johan Ugander, Sharad Goel

Abstract: Several recent studies of online social networking platforms have found that adoption rates and engagement levels are positively correlated with structural diversity, the degree of heterogeneity among an individual's contacts as measured by network ties. One common theory for this observation is that structural diversity increases utility, in part because there is value to interacting with people from different network components on the same platform. While compelling, evidence for this causal theory comes from observational studies, making it difficult to rule out non-causal explanations. We investigate the role of structural diversity on retention by conducting a large-scale randomized controlled study on the Twitter platform. We first show that structural diversity correlates with user retention on Twitter, corroborating results from past observational studies. We then exogenously vary structural diversity by altering the set of network recommendations new users see when joining the platform; we confirm that this design induces the desired changes to network topology. We find, however, that low, medium, and high structural diversity treatment groups in our experiment have comparable retention rates. Thus, at least in this case, the observed correlation between structural diversity and retention does not appear to result from a causal relationship, challenging theories based on past observational studies.

Submitted 9 September, 2019; v1 submitted 8 September, 2019; originally announced September 2019.

Comments: To appear in the Proceedings of International AAAI Conference on Web and Social Media (ICWSM 2020)

arXiv:1908.01119 [pdf, other]

Optimal Information Updating based on Value of Information

Authors: Rahul Singh, Gopal Krishna Kamath, P. R. Kumar

Abstract: We address the problem of how to optimally schedule data packets over an unreliable channel in order to minimize the estimation error of a simple-to-implement remote linear estimator using a constant "Kalman" gain to track the state of a Gauss-Markov process. The remote estimator receives time-stamped data packets which contain noisy observations of the process. Additionally, they also contain the information about the "quality" of the sensor source, i.e., the variance of the observation noise that was used to generate the packet. In order to minimize the estimation error, the scheduler needs to use both while prioritizing packet transmissions. It is shown that a simple index rule that calculates the value of information (VoI) of each packet, and then schedules the packet with the largest current value of VoI, is optimal. The VoI of a packet decreases with its age, and increases with the precision of the source. Thus, we conclude that, for constant filter gains, a policy which minimizes the age of information does not necessarily maximize the estimator performance.

Submitted 3 August, 2019; originally announced August 2019.

Comments: Accepted in Allerton 2019

arXiv:1702.07390 [pdf, other]

doi 10.1145/3041021.3055139

Detecting Strong Ties Using Network Motifs

Authors: Rahmtin Rotabi, Krishna Kamath, Jon Kleinberg, Aneesh Sharma

Abstract: Detecting strong ties among users in social and information networks is a fundamental operation that can improve performance on a multitude of personalization and ranking tasks. Strong-tie edges are often readily obtained from the social network as users often participate in multiple overlapping networks via features such as following and messaging. These networks may vary greatly in size, density and the information they carry. This setting leads to a natural strong tie detection task: given a small set of labeled strong tie edges, how well can one detect unlabeled strong ties in the remainder of the network? This task becomes particularly daunting for the Twitter network due to scant availability of pairwise relationship attribute data, and sparsity of strong tie networks such as phone contacts. Given these challenges, a natural approach is to instead use structural network features for the task, produced by combining the strong and "weak" edges. In this work, we demonstrate via experiments on Twitter data that using only such structural network features is sufficient for detecting strong ties with high precision. These structural network features are obtained from the presence and frequency of small network motifs on combined strong and weak ties. We observe that using motifs larger than triads alleviates sparsity problems that arise for smaller motifs, both due to increased combinatorial possibilities as well as benefiting strongly from searching beyond the ego network.
Empirically, we observe that not all motifs are equally useful, and need to be carefully constructed from the combined edges in order to be effective for strong tie detection. Finally, we reinforce our experimental findings by providing theoretical justification that suggests why incorporating these larger sized motifs as features could lead to increased performance in planted graph models.

Submitted 26 March, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

Comments: To appear in Proceedings of WWW 2017 (Web-science track)

arXiv:1702.06673 [pdf, other]

doi 10.1145/3038912.3052647

Cascades: A view from Audience

Authors: Rahmtin Rotabi, Krishna Kamath, Jon Kleinberg, Aneesh Sharma

Abstract: Cascades on online networks have been a popular subject of study in the past decade, and there is a considerable literature on phenomena such as diffusion mechanisms, virality, cascade prediction, and peer network effects. However, a basic question has received comparatively little attention: how desirable are cascades on a social media platform from the point of view of users? While versions of this question have been considered from the perspective of the producers of cascades, any answer to this question must also take into account the effect of cascades on their audience. In this work, we seek to fill this gap by providing a consumer perspective of cascades. Users on online networks play the dual role of producers and consumers. First, we perform an empirical study of the interaction of Twitter users with retweet cascades. We measure how often users observe retweets in their home timeline, and observe a phenomenon that we term the "Impressions Paradox": the share of impressions for cascades of size k decays much more slowly than the frequency of cascades of size k. Thus, the audience for cascades can be quite large even for rare large cascades. We also measure audience engagement with retweet cascades in comparison to non-retweeted content. Our results show that cascades often rival or exceed organic content in engagement received per impression. This result is perhaps surprising in that consumers didn't opt in to see tweets from these authors.
Furthermore, although cascading content is widely popular, one would expect it to eventually reach parts of the audience that may not be interested in the content. Motivated by our findings, we posit a theoretical model that focuses on the effect of cascades on the audience. Our results on this model highlight the balance between retweeting as a high-quality content selection mechanism and the role of network users in filtering irrelevant content.

Submitted 26 March, 2017; v1 submitted 21 February, 2017; originally announced February 2017.

Showing 1–10 of 10 results for author: Kamath, K