Showing 1–11 of 11 results for author: Upadrasta, R

  1. arXiv:2312.00507  [pdf, other]

    cs.PL cs.CR cs.LG

    VEXIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity

    Authors: S. VenkataKeerthy, Soumya Banerjee, Sayan Dey, Yashas Andaluri, Raghul PS, Subrahmanyam Kalyanasundaram, Fernando Magno Quintão Pereira, Ramakrishna Upadrasta

    Abstract: Binary similarity involves determining whether two binary programs exhibit similar functionality, often originating from the same source code. In this work, we propose VexIR2Vec, an approach for binary similarity using VEX-IR, an architecture-neutral Intermediate Representation (IR). We extract the embeddings from sequences of basic blocks, termed peepholes, derived by random walks on the control-…

    Submitted 9 July, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

  2. arXiv:2311.10800  [pdf, other]

    cs.PL cs.LG cs.PF

    The Next 700 ML-Enabled Compiler Optimizations

    Authors: S. VenkataKeerthy, Siddharth Jain, Umesh Kalvakuntla, Pranav Sai Gorantla, Rajiv Shailesh Chitale, Eugene Brevdo, Albert Cohen, Mircea Trofin, Ramakrishna Upadrasta

    Abstract: There is a growing interest in enhancing compiler optimizations with ML models, yet interactions between compilers and ML frameworks remain challenging. Some optimizations require tightly coupled models and compiler internals, raising issues with modularity, performance and framework independence. Practical deployment and transparency for the end-user are also important concerns. We propose ML-Comp…

    Submitted 17 November, 2023; originally announced November 2023.

  3. POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning

    Authors: Shalini Jain, Yashas Andaluri, S. VenkataKeerthy, Ramakrishna Upadrasta

    Abstract: The ever-increasing memory requirements of several applications have led to increased demands which might not be met by embedded devices. Constraining the usage of memory in such cases is of paramount importance. It is important that such code size improvements should not have a negative impact on the runtime. Improving the execution time while optimizing for code size is a non-trivial but a signif…

    Submitted 27 July, 2022; originally announced August 2022.

    Comments: Published in ISPASS-2022

  4. arXiv:2204.02013  [pdf, other]

    cs.LG cs.AR cs.PL

    RL4ReAl: Reinforcement Learning for Register Allocation

    Authors: S. VenkataKeerthy, Siddharth Jain, Anilava Kundu, Rohit Aggarwal, Albert Cohen, Ramakrishna Upadrasta

    Abstract: We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with the state of the art techniques. We formalize the constraints that precisely define the problem for a given instruction-set architecture, while ensuring that the generated code pre…

    Submitted 6 February, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Published in CC'23

    ACM Class: D.2; I.2.5

  5. arXiv:2203.09284  [pdf, other]

    cs.DS cs.PF

    FUSED-PAGERANK: Loop-Fusion based Approximate PageRank

    Authors: Shalini Jain, Rahul Utkoor, Hemalatha Eedi, Sathya Peri, Ramakrishna Upadrasta

    Abstract: PageRank is a graph centrality metric that gives the importance of each node in a given graph. The PageRank algorithm provides important insights to understand the behavior of nodes through the connections they form with other nodes. It is an iterative algorithm that ranks the nodes in each iteration until all the node values converge. The PageRank algorithm is implemented using sparse storage for…

    Submitted 17 March, 2022; originally announced March 2022.

  6. arXiv:2111.04259  [pdf, ps, other]

    cs.PL cs.SE

    OpenMP aware MHP Analysis for Improved Static Data-Race Detection

    Authors: Utpal Bora, Shraiysh Vaishay, Saurabh Joshi, Ramakrishna Upadrasta

    Abstract: Data races, a major source of bugs in concurrent programs, can result in loss of manpower and time as well as data loss due to system failures. OpenMP, the de facto shared memory parallelism framework used in the HPC community, also suffers from data races. To detect race conditions in OpenMP programs and improve turnaround time and/or developer productivity, we present a data flow analysis based,…

    Submitted 7 November, 2021; originally announced November 2021.

    Comments: Accepted at LLVM-HPC'21

    ACM Class: D.1.3; D.2.4; D.2.5; D.3.4

  7. arXiv:2104.05573  [pdf, other]

    cs.PL

    AI Powered Compiler Techniques for DL Code Optimization

    Authors: Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta

    Abstract: Creating high performance implementations of deep learning primitives on CPUs is a challenging task. Multiple considerations including multi-level cache hierarchy, and wide SIMD units of CPU platforms influence the choice of program transformations to apply for performance optimization. In this paper, we present machine learning powered compiler techniques to optimize loop nests. We take a two-pro…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: text overlap with arXiv:2006.02230, arXiv:2002.02145

  8. arXiv:2006.02230  [pdf, other]

    cs.DC cs.AI cs.PL

    PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives

    Authors: Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul

    Abstract: Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of DNNs is becoming ubiquitous including in software for image recognition, speech recognition, speech synthesis, language translation, to name a few. The training of DNN architectures however is computationally expensive. Once the model is created, its use in the intended application - the inference task, is comput…

    Submitted 17 November, 2020; v1 submitted 2 June, 2020; originally announced June 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2002.02145

  9. arXiv:2002.02145  [pdf, other]

    cs.PL cs.LG

    PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives

    Authors: Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul

    Abstract: At the heart of deep learning training and inferencing are computationally intensive primitives such as convolutions which form the building blocks of deep neural networks. Researchers have taken two distinct approaches to creating high performance implementations of deep learning kernels, namely, 1) library development exemplified by Intel MKL-DNN for CPUs, 2) automatic compilation represented by…

    Submitted 6 February, 2020; originally announced February 2020.

  10. arXiv:1912.12189  [pdf, other]

    cs.PL cs.LO cs.SE

    LLOV: A Fast Static Data-Race Checker for OpenMP Programs

    Authors: Utpal Bora, Santanu Das, Pankaj Kukreja, Saurabh Joshi, Ramakrishna Upadrasta, Sanjay Rajopadhye

    Abstract: In the era of Exascale computing, writing efficient parallel programs is indispensable and at the same time, writing sound parallel programs is very difficult. Specifying parallelism with frameworks such as OpenMP is relatively easy, but data races in these programs are an important source of bugs. In this paper, we propose LLOV, a fast, lightweight, language agnostic, and static data race checker…

    Submitted 1 September, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: Accepted in ACM TACO, August 2020

    ACM Class: D.2; D.3

  11. arXiv:1909.06228  [pdf, other]

    cs.PL cs.LG cs.NE cs.SE

    IR2Vec: LLVM IR based Scalable Program Embeddings

    Authors: S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta, Y. N. Srikant

    Abstract: We propose IR2Vec, a Concise and Scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to capture the syntax as well as the semantics of the input programs. As our infrastructure is based on the Intermediate Representation (IR) of the source cod…

    Submitted 1 September, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: Accepted in ACM TACO