Showing 1–17 of 17 results for author: Shaikhha, A

  1. arXiv:2303.07030  [pdf, other]

    cs.PL cs.LG cs.MS

    $\nabla$SD: Differentiable Programming for Sparse Tensors

    Authors: Amir Shaikhha, Mathieu Huot, Shideh Hashemian

    Abstract: Sparse tensors are prevalent in many data-intensive applications, yet existing differentiable programming frameworks are tailored towards dense tensors. This presents a significant challenge for efficiently computing gradients through sparse tensor operations, as their irregular sparsity patterns can result in substantial memory and computational overheads. In this work, we introduce a novel frame…

    Submitted 13 March, 2023; originally announced March 2023.

  2. arXiv:2212.10307  [pdf, other]

    cs.PL cs.LG cs.MS

    Efficient and Sound Differentiable Programming in a Functional Array-Processing Language

    Authors: Amir Shaikhha, Mathieu Huot, Shabnam Ghasemirad, Andrew Fitzgibbon, Simon Peyton Jones, Dimitrios Vytiniotis

    Abstract: Automatic differentiation (AD) is a technique for computing the derivative of a function represented by a program. This technique is considered as the de-facto standard for computing the differentiation in many machine learning and optimisation software tools. Despite the practicality of this technique, the performance of the differentiated programs, especially for functional languages and in the…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:1806.02136

  3. arXiv:2212.09801  [pdf, ps, other]

    cs.PL

    Denotationally Correct, Purely Functional, Efficient Reverse-mode Automatic Differentiation

    Authors: Mathieu Huot, Amir Shaikhha

    Abstract: Reverse-mode differentiation is used for optimization, but it introduces references, which break the purity of the underlying programs, making them notoriously harder to optimize. We present a reverse-mode differentiation on a purely functional language with array operations. It is the first one to deliver a provably efficient, purely functional, and denotationally correct reverse-mode differentia…

    Submitted 26 April, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: 34 pages, 17 figures

  4. arXiv:2211.10482  [pdf, other]

    cs.PL cs.MS cs.SC

    Compiling Structured Tensor Algebra

    Authors: Mahdi Ghorbani, Mathieu Huot, Shideh Hashemian, Amir Shaikhha

    Abstract: Tensor algebra is essential for data-intensive workloads in various computational domains. Computational scientists face a trade-off between the specialization degree provided by dense tensor algebra and the algorithmic efficiency that leverages the structure provided by sparse tensors. This paper presents StructTensor, a framework that symbolically computes structure at compilation time. This is…

    Submitted 18 November, 2022; originally announced November 2022.

  5. arXiv:2210.06267  [pdf, other]

    cs.DB cs.PL

    Optimizing Tensor Programs on Flexible Storage

    Authors: Maximilian Schleich, Amir Shaikhha, Dan Suciu

    Abstract: Tensor programs often need to process large tensors (vectors, matrices, or higher order tensors) that require a specialized storage format for their memory layout. Several such layouts have been proposed in the literature, such as the Coordinate Format, the Compressed Sparse Row format, and many others, that were especially designed to optimally store tensors with specific sparsity properties. How…

    Submitted 12 October, 2022; originally announced October 2022.

  6. arXiv:2206.04380  [pdf, other]

    cs.PL cs.DS

    Hinted Dictionaries: Efficient Functional Ordered Sets and Maps

    Authors: Amir Shaikhha, Mahdi Ghorbani, Hesam Shahrokhi

    Abstract: This article introduces hinted dictionaries for expressing efficient ordered sets and maps functionally. As opposed to the traditional ordered dictionaries with logarithmic operations, hinted dictionaries can achieve better performance by using cursor-like objects referred to as hints. Hinted dictionaries unify the interfaces of imperative ordered dictionaries (e.g., C++ maps) and functional ones…

    Submitted 9 June, 2022; originally announced June 2022.

  7. arXiv:2112.13099  [pdf, other]

    cs.DB cs.LG

    Fine-Tuning Data Structures for Analytical Query Processing

    Authors: Amir Shaikhha, Marios Kelepeshis, Mahdi Ghorbani

    Abstract: We introduce a framework for automatically choosing data structures to support efficient computation of analytical workloads. Our contributions are twofold. First, we introduce a novel low-level intermediate language that can express the algorithms behind various query processing paradigms such as classical joins, groupjoin, and in-database machine learning engines. This language is designed aroun…

    Submitted 24 December, 2021; originally announced December 2021.

  8. arXiv:2103.06376  [pdf, other]

    cs.PL cs.DB cs.LG

    Functional Collection Programming with Semi-Ring Dictionaries

    Authors: Amir Shaikhha, Mathieu Huot, Jaclyn Smith, Dan Olteanu

    Abstract: This paper introduces semi-ring dictionaries, a powerful class of compositional and purely functional collections that subsume other collection types such as sets, multisets, arrays, vectors, and matrices. We developed SDQL, a statically typed language that can express relational algebra with aggregations, linear algebra, and functional collections over data such as relations and matrices using se…

    Submitted 22 March, 2022; v1 submitted 10 March, 2021; originally announced March 2021.

  9. arXiv:2012.14743  [pdf, other]

    cs.DB cs.LG

    BayesCard: Revitilizing Bayesian Frameworks for Cardinality Estimation

    Authors: Ziniu Wu, Amir Shaikhha, Rong Zhu, Kai Zeng, Yuxing Han, Jingren Zhou

    Abstract: Cardinality estimation (CardEst) is an essential component in query optimizers and a fundamental problem in DBMS. A desired CardEst method should attain good algorithm performance, be stable to varied data settings, and be friendly to system deployment. However, no existing CardEst method can fulfill the three criteria at the same time. Traditional methods often have significant algorithm drawback…

    Submitted 2 February, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

  10. arXiv:2011.06381  [pdf, other]

    cs.DB

    Scalable Querying of Nested Data

    Authors: Jaclyn Smith, Michael Benedikt, Milos Nikolic, Amir Shaikhha

    Abstract: While large-scale distributed data processing platforms have become an attractive target for query processing, these systems are problematic for applications that deal with nested collections. Programmers are forced either to perform non-trivial translations of collection programs or to employ automated flattening procedures, both of which lead to performance problems. These challenges only worsen…

    Submitted 12 November, 2020; originally announced November 2020.

    ACM Class: H.2.3; H.2.4

  11. arXiv:2001.03541  [pdf, other]

    cs.PL cs.DB cs.LG

    Multi-layer Optimizations for End-to-End Data Analytics

    Authors: Amir Shaikhha, Maximilian Schleich, Alexandru Ghita, Dan Olteanu

    Abstract: We consider the problem of training machine learning models over multi-relational data. The mainstream approach is to first construct the training dataset using a feature extraction query over input database and then use a statistical software package of choice to train the model. In this paper we introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative app…

    Submitted 10 January, 2020; originally announced January 2020.

  12. arXiv:1808.01344  [pdf, other]

    cs.PL

    A Compiler-Compiler for DSL Embedding

    Authors: Amir Shaikhha, Vojin Jovanovic, Christoph Koch

    Abstract: In this paper, we present a framework to generate compilers for embedded domain-specific languages (EDSLs). This framework provides facilities to automatically generate the boilerplate code required for building DSL compilers on top of extensible optimizing compilers. We evaluate the practicality of our framework by demonstrating several use-cases successfully built with it.

    Submitted 3 August, 2018; originally announced August 2018.

  13. arXiv:1807.09887  [pdf, other]

    cs.DB

    Compiling Database Application Programs

    Authors: Mohammad Dashti, Sachin Basil John, Thierry Coppey, Amir Shaikhha, Vojin Jovanovic, Christoph Koch

    Abstract: There is a trend towards increased specialization of data management software for performance reasons. In this paper, we study the automatic specialization and optimization of database application programs -- sequences of queries and updates, augmented with control flow constructs as they appear in database scripts, UDFs, transactional workloads and triggers in languages such as PL/SQL. We show ho…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: 16 pages

    ACM Class: H.2.4

  14. arXiv:1806.02136  [pdf, other]

    cs.MS cs.LG cs.PL cs.SC stat.ML

    Efficient Differentiable Programming in a Functional Array-Processing Language

    Authors: Amir Shaikhha, Andrew Fitzgibbon, Dimitrios Vytiniotis, Simon Peyton Jones, Christoph Koch

    Abstract: We present a system for the automatic differentiation of a higher-order functional array-processing language. The core functional language underlying this system simultaneously supports both source-to-source automatic differentiation and global optimizations such as loop transformations. Thanks to this feature, we demonstrate how for some real-world machine learning and computer vision benchmarks,…

    Submitted 6 June, 2018; originally announced June 2018.

  15. arXiv:1612.05566  [pdf, other]

    cs.DB

    Building Efficient Query Engines in a High-Level Language

    Authors: Amir Shaikhha, Yannis Klonatos, Christoph Koch

    Abstract: Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance, instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to…

    Submitted 16 December, 2016; originally announced December 2016.

  16. arXiv:1610.09166  [pdf, other]

    cs.DB cs.PL

    Push vs. Pull-Based Loop Fusion in Query Engines

    Authors: Amir Shaikhha, Mohammad Dashti, Christoph Koch

    Abstract: Database query engines use pull-based or push-based approaches to avoid the materialization of data across query operators. In this paper, we study these two types of query engines in depth and present the limitations and advantages of each engine. Similarly, the programming languages community has developed loop fusion techniques to remove intermediate collections in the context of collection pro…

    Submitted 28 October, 2016; originally announced October 2016.

  17. arXiv:1603.00542  [pdf, other]

    cs.DB

    Repairing Conflicts among MVCC Transactions

    Authors: Mohammad Dashti, Sachin Basil John, Amir Shaikhha, Christoph Koch

    Abstract: The optimistic variants of MVCC (Multi-Version Concurrency Control) avoid blocking concurrent transactions at the cost of having a validation phase. Upon failure in the validation phase, the transaction is usually aborted and restarted from scratch. The "abort and restart" approach becomes a performance bottleneck for the use cases with high contention objects or long running transactions. In addi…

    Submitted 1 March, 2016; originally announced March 2016.

    Comments: 12 pages, 9 figures

    ACM Class: H.2.4