Showing 1–14 of 14 results for author: Rajopadhye, S
  1. arXiv:2401.12071  [pdf, other]

    cs.AR

    An Irredundant and Compressed Data Layout to Optimize Bandwidth Utilization of FPGA Accelerators

    Authors: Corentin Ferry, Nicolas Derumigny, Steven Derrien, Sanjay Rajopadhye

    Abstract: Memory bandwidth is known to be a performance bottleneck for FPGA accelerators, especially when they deal with large multi-dimensional data-sets. A large body of work focuses on reducing off-chip transfers, but few authors try to improve the efficiency of transfers. This paper addresses the latter issue by proposing (i) a compiler-based approach to accelerator's data layout to maximize contiguou…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 11 pages, 11 figures, 2 tables

  2. arXiv:2312.03646  [pdf, other]

    cs.PL

    An Irredundant Decomposition of Data Flow with Affine Dependences

    Authors: Corentin Ferry, Steven Derrien, Sanjay Rajopadhye

    Abstract: Optimization pipelines targeting polyhedral programs try to maximize the compute throughput. Traditional approaches favor reuse and temporal locality; while the communicated volume can be low, failure to optimize spatial locality may cause low I/O performance. Memory allocation schemes using data partitioning such as data tiling can improve the spatial locality, but they are domain-specific an…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 11 pages, 9 figures

  3. arXiv:2309.11826  [pdf, other]

    cs.PL

    Maximal Simplification of Polyhedral Reductions

    Authors: Louis Narmour, Tomofumi Yuki, Sanjay Rajopadhye

    Abstract: Reductions combine multiple input values with an associative operator to produce a single (or multiple) result(s). When the same input value contributes to multiple outputs, there is an opportunity to reuse partial results, enabling reduction simplification. Simplification produces a program with lower asymptotic complexity. It is well known that reductions in polyhedral programs may be simplified…

    Submitted 21 September, 2023; originally announced September 2023.

  4. arXiv:2211.15933  [pdf, other]

    cs.PL cs.DC

    Maximal Atomic irRedundant Sets: a Usage-based Dataflow Partitioning Algorithm

    Authors: Corentin Ferry, Steven Derrien, Sanjay Rajopadhye

    Abstract: Programs admitting a polyhedral representation can be transformed in many ways for locality and parallelism, notably loop tiling. Data flow analysis can then compute dependence relations between iterations and between tiles. When tiling is applied, certain iteration-wise dependences cross tile boundaries, creating the need for inter-tile data communication. Previous work computes it as the flow-in…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: 8 pages, 5 figures, 1 table

  5. arXiv:2202.09512  [pdf, other]

    cs.DC

    Distributed non-negative RESCAL with Automatic Model Selection for Exascale Data

    Authors: Manish Bhattarai, Namita Kharat, Erik Skau, Benjamin Nebgen, Hristo Djidjev, Sanjay Rajopadhye, James P. Smith, Boian Alexandrov

    Abstract: With the boom in the development of computer hardware and software, social media, IoT platforms, and communications, there has been an exponential growth in the volume of data produced around the world. Among these data, relational datasets are growing in popularity as they provide unique insights regarding the evolution of communities and their interactions. Relational datasets are naturally non-…

    Submitted 18 February, 2022; originally announced February 2022.

  6. arXiv:2202.05933  [pdf, other]

    cs.AR

    Increasing FPGA Accelerators Memory Bandwidth with a Burst-Friendly Memory Layout

    Authors: Corentin Ferry, Tomofumi Yuki, Steven Derrien, Sanjay Rajopadhye

    Abstract: Offloading compute-intensive kernels to hardware accelerators relies on the large degree of parallelism offered by these platforms. However, the effective bandwidth of the memory interface often causes a bottleneck, hindering the accelerator's effective performance. Techniques enabling data reuse, such as tiling, lower the pressure on memory traffic but still often leave the accelerators I/O-bound…

    Submitted 23 February, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: 16 pages; 17 figures

    ACM Class: B.4.4

  7. arXiv:2010.03074  [pdf, ps, other]

    cs.PL

    On Simplifying Dependent Polyhedral Reductions

    Authors: Sanjay Rajopadhye

    Abstract: Reductions combine collections of input values with an associative (and usually also commutative) operator to produce either a single output or a collection of outputs. They are ubiquitous in computing, especially with big data and deep learning. When the same input value contributes to multiple output values, there is a tremendous opportunity for reducing (pun intended) the computational…

    Submitted 6 October, 2020; originally announced October 2020.

  8. arXiv:1912.12189  [pdf, other]

    cs.PL cs.LO cs.SE

    LLOV: A Fast Static Data-Race Checker for OpenMP Programs

    Authors: Utpal Bora, Santanu Das, Pankaj Kukreja, Saurabh Joshi, Ramakrishna Upadrasta, Sanjay Rajopadhye

    Abstract: In the era of exascale computing, writing efficient parallel programs is indispensable, and at the same time, writing sound parallel programs is very difficult. Specifying parallelism with frameworks such as OpenMP is relatively easy, but data races in these programs are an important source of bugs. In this paper, we propose LLOV, a fast, lightweight, language-agnostic, and static data race checker…

    Submitted 1 September, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: Accepted in ACM TACO, August 2020

    ACM Class: D.2; D.3

  9. arXiv:1904.01235  [pdf, other]

    q-bio.BM q-bio.GN

    BPPart and BPMax: RNA-RNA Interaction Partition Function and Structure Prediction for the Base Pair Counting Model

    Authors: Ali Ebrahimpour-Boroojeny, Sanjay Rajopadhye, Hamidreza Chitsaz

    Abstract: RNA-RNA interaction (RRI) is ubiquitous and has complex roles in cellular functions. In human health studies, miRNA-target and lncRNAs are among an elite class of RRIs that have been extensively studied. Bacterial ncRNA-target and RNA interference are other classes of RRIs that have received significant attention. In recent studies, mRNA-mRNA interaction instances have been observed, where bot…

    Submitted 7 August, 2020; v1 submitted 2 April, 2019; originally announced April 2019.

  10. arXiv:1802.01957  [pdf, other]

    cs.PF cs.PL

    Analytical Cost Metrics: Days of Future Past

    Authors: Nirmal Prajapati, Sanjay Rajopadhye, Hristo Djidjev

    Abstract: As we move towards the exascale era, new architectures must be capable of running massive computational problems efficiently. Scientists and researchers are continuously investing in tuning the performance of extreme-scale computational problems. These problems arise in almost all areas of computing, ranging from big data analytics, artificial intelligence, search, machine learning, virtua…

    Submitted 5 February, 2018; originally announced February 2018.

  11. arXiv:1802.00166  [pdf, other]

    cs.PL

    PCOT: Cache Oblivious Tiling of Polyhedral Programs

    Authors: Waruna Ranasinghe, Nirmal Prajapati, Tomofumi Yuki, Sanjay Rajopadhye

    Abstract: This paper studies two variants of tiling: iteration space tiling (or loop blocking) and cache-oblivious methods that recursively split the iteration space with divide-and-conquer. The key question to answer is when we should be using one over the other. The answer to this question is complicated for modern architectures for a number of reasons. In this paper, we present a detailed empirical stu…

    Submitted 1 February, 2018; originally announced February 2018.

  12. arXiv:1712.04892  [pdf, other]

    cs.AR

    Accelerator Codesign as Non-Linear Optimization

    Authors: Nirmal Prajapati, Sanjay Rajopadhye, Hristo Djidjev, Nandkishore Santhi, Tobias Grosser, Rumen Andonov

    Abstract: We propose an optimization approach for determining both hardware and software parameters for the efficient implementation of a (family of) applications called dense stencil computations on programmable GPGPUs. We first introduce a simple, analytical model for the silicon area usage of accelerator architectures and a workload characterization of stencil computations. We combine this characterizati…

    Submitted 13 December, 2017; originally announced December 2017.

    Comments: 10 pages, 4 figures, 2 tables

  13. arXiv:1610.07236  [pdf, other]

    cs.PL cs.DC cs.PF

    Hybrid Static/Dynamic Schedules for Tiled Polyhedral Programs

    Authors: Tian Jin, Nirmal Prajapati, Waruna Ranasinghe, Guillaume Iooss, Yun Zou, Sanjay Rajopadhye, David Wonnacott

    Abstract: Polyhedral compilers perform optimizations such as tiling and parallelization; when doing both, they usually generate code that executes "barrier-synchronized wavefronts" of tiles. We present a system to express and generate code for hybrid schedules, where some constraints are automatically satisfied through the structure of the code, and the remainder are dynamically enforced at run-time with da…

    Submitted 23 October, 2016; originally announced October 2016.

  14. arXiv:1311.4305  [pdf, ps, other]

    cs.DC cs.PL

    Checking Race Freedom of Clocked X10 Programs

    Authors: Tomofumi Yuki, Paul Feautrier, Sanjay Rajopadhye, Vijay Saraswat

    Abstract: One of many approaches to taking better advantage of parallelism, which has now become mainstream, is the introduction of parallel programming languages. However, parallelism is by nature non-deterministic, and not all parallel bugs can be avoided by language design. This paper proposes a method for guaranteeing the absence of data races in the polyhedral subset of clocked X10 programs. Clocks in X10…

    Submitted 18 November, 2013; originally announced November 2013.

    Comments: 11 pages