Search | arXiv e-print repository

doi 10.1145/3650200.3656632

Distributed Ranges: A Model for Distributed Data Structures, Algorithms, and Views

Authors: Benjamin Brock, Robert Cohn, Suyash Bakshi, Tuomas Karna, Jeongnim Kim, Mateusz Nowak, Łukasz Ślusarczyk, Kacper Stefanski, Timothy G. Mattson

Abstract: Data structures and algorithms are essential building blocks for programs, and \emph{distributed data structures}, which automatically partition data across multiple memory locales, are essential to writing high-level parallel programs. While many projects have designed and implemented C++ distributed data structures and algorithms, there has not been widespread adoption of an interoperable model… ▽ More Data structures and algorithms are essential building blocks for programs, and \emph{distributed data structures}, which automatically partition data across multiple memory locales, are essential to writing high-level parallel programs. While many projects have designed and implemented C++ distributed data structures and algorithms, there has not been widespread adoption of an interoperable model allowing algorithms and data structures from different libraries to work together. This paper introduces distributed ranges, which is a model for building generic data structures, views, and algorithms. A distributed range extends a C++ range, which is an iterable sequence of values, with a concept of segmentation, thus exposing how the distributed range is partitioned over multiple memory locales. Distributed data structures provide this distributed range interface, which allows them to be used with a collection of generic algorithms implemented using the distributed range interface. The modular nature of the model allows for the straightforward implementation of \textit{distributed views}, which are lightweight objects that provide a lazily evaluated view of another range. Views can be composed together recursively and combined with algorithms to implement computational kernels using efficient, flexible, and high-level standard C++ primitives. We evaluate the distributed ranges model by implementing a set of standard concepts and views as well as two execution runtimes, a multi-node, MPI-based runtime and a single-process, multi-GPU runtime. We demonstrate that high-level algorithms implemented using generic, high-level distributed ranges can achieve performance competitive with highly-tuned, expert-written code. △ Less

Submitted 31 May, 2024; originally announced June 2024.

Comments: To appear in ACM International Conference on Supercomputing (ICS) 2024

Journal ref: In Proceedings of the 38th ACM International Conference on Supercomputing (ICS 2024) 236-246

arXiv:2405.13918 [pdf, other]

An Abstraction Hierarchy Toward Productive Quantum Programming

Authors: Olivia Di Matteo, Santiago Núñez-Corrales, Michał Stęchły, Steven P. Reinhardt, Tim Mattson

Abstract: Experience from seven decades of classical computing suggests that a sustainable computer industry depends on a community of software engineers writing programs to address a wide variety of specific end-user needs, achieving both performance and utility in the process. Quantum computing is an emerging technology, and we do not yet have the insight to understand what quantum software tools and prac… ▽ More Experience from seven decades of classical computing suggests that a sustainable computer industry depends on a community of software engineers writing programs to address a wide variety of specific end-user needs, achieving both performance and utility in the process. Quantum computing is an emerging technology, and we do not yet have the insight to understand what quantum software tools and practices will best support researchers, software engineers, or applications specialists. Developers for today's quantum computers are grappling with the low-level details of the hardware, and progress towards scalable devices does not yet suggest what higher-level abstractions may look like. In this paper, we analyze and reframe the current state of the quantum software stack using the language of programming models. We propose an abstraction hierarchy to support quantum software engineering and discuss the consequences of overlaps across the programming, execution, and hardware models found in current technologies. We exercise this hierarchy for solving the eigenvalue estimation problem in two ways (a variational algorithm with error mitigation, and phase estimation with error correction) and pinpoint key differences in these approaches in terms of these layered models and their overlaps. While our work points to concrete conceptual challenges and gaps in quantum programming and proposes some specific steps forward, our primary thesis is that progress hinges on thinking about the abstraction hierarchy holistically, and not just about its components. △ Less

Submitted 22 May, 2024; originally announced May 2024.

Comments: 11 pages, 3 figures. Submitted to IEEE QCE 24

arXiv:2402.09126 [pdf, other]

MPIrigen: MPI Code Generation through Domain-Specific Language Models

Authors: Nadav Schneider, Niranjan Hasabnis, Vy A. Vo, Tal Kadosh, Neva Krien, Mihai Capotă, Guy Tamir, Ted Willke, Nesreen Ahmed, Yuval Pinter, Timothy Mattson, Gal Oren

Abstract: The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generati… ▽ More The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generating MPI-based parallel programs. Findings reveal that widely used models such as GPT-3.5 and PolyCoder (specialized multi-lingual code models) exhibit notable performance degradation, when generating MPI-based programs compared to general-purpose programs. In contrast, domain-specific models such as MonoCoder, which are pretrained on MPI-related programming languages of C and C++, outperform larger models. Subsequently, we introduce a dedicated downstream task of MPI-based program generation by fine-tuning MonoCoder on HPCorpusMPI. We call the resulting model as MPIrigen. We propose an innovative preprocessing for completion only after observing the whole code, thus enabling better completion with a wider context. Comparative analysis against GPT-3.5 zero-shot performance, using a novel HPC-oriented evaluation method, demonstrates that MPIrigen excels in generating accurate MPI functions up to 0.8 accuracy in location and function predictions, and with more than 0.9 accuracy for argument predictions. The success of this tailored solution underscores the importance of domain-specific fine-tuning in optimizing language models for parallel computing code generation, paving the way for a new generation of automatic parallelization tools. The sources of this work are available at our GitHub MPIrigen repository: https://github.com/Scientific-Computing-Lab-NRCN/MPI-rigen △ Less

Submitted 23 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.02018 [pdf, other]

The Landscape and Challenges of HPC Research and LLMs

Authors: Le Chen, Nesreen K. Ahmed, Akash Dutta, Arijit Bhattacharjee, Sixing Yu, Quazi Ishtiaque Mahmud, Waqwoya Abebe, Hung Phan, Aishwarya Sarkar, Branden Butler, Niranjan Hasabnis, Gal Oren, Vy A. Vo, Juan Pablo Munoz, Theodore L. Willke, Tim Mattson, Ali Jannesari

Abstract: Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breach… ▽ More Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language model-based techniques for tasks in high-performance computing (HPC) would be very beneficial. This study presents our reasoning behind the aforementioned position and highlights how existing ideas can be improved and adapted for HPC tasks. △ Less

Submitted 6 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

arXiv:2312.13322 [pdf, other]

Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks

Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Mihai Capota, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

Abstract: With easier access to powerful compute resources, there is a growing trend in AI for software development to develop larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size and demand expensive compute resources for training. This is partly because these LLMs for HPC tasks are obtained by… ▽ More With easier access to powerful compute resources, there is a growing trend in AI for software development to develop larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size and demand expensive compute resources for training. This is partly because these LLMs for HPC tasks are obtained by finetuning existing LLMs that support several natural and/or programming languages. We found this design choice confusing - why do we need large LMs trained on natural languages and programming languages unrelated to HPC for HPC-specific tasks? In this line of work, we aim to question choices made by existing LLMs by develo** smaller LMs for specific domains - we call them domain-specific LMs. Specifically, we start off with HPC as a domain and build an HPC-specific LM, named MonoCoder, that is orders of magnitude smaller than existing LMs but delivers similar, if not better performance, on non-HPC and HPC tasks. Specifically, we pre-trained MonoCoder on an HPC-specific dataset (named HPCorpus) of C and C++ programs mined from GitHub. We evaluated the performance of MonoCoder against conventional multi-lingual LLMs. Results demonstrate that MonoCoder, although much smaller than existing LMs, achieves similar results on normalized-perplexity tests and much better ones in CodeBLEU competence for high-performance and parallel code generations. Furthermore, fine-tuning the base model for the specific task of parallel code generation (OpenMP parallel for pragmas) demonstrates outstanding results compared to GPT, especially when local misleading semantics are removed by our novel pre-processor Tokompiler, showcasing the ability of domain-specific models to assist in HPC-relevant tasks. △ Less

Submitted 20 December, 2023; originally announced December 2023.

arXiv:2308.09440 [pdf, other]

Scope is all you need: Transforming LLMs for HPC Code

Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

Abstract: With easier access to powerful compute resources, there is a growing trend in the field of AI for software development to develop larger and larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size (e.g., billions of parameters) and demand expensive compute resources for training. We found… ▽ More With easier access to powerful compute resources, there is a growing trend in the field of AI for software development to develop larger and larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size (e.g., billions of parameters) and demand expensive compute resources for training. We found this design choice confusing - why do we need large LLMs trained on natural languages and programming languages unrelated to HPC for HPC-specific tasks? In this line of work, we aim to question design choices made by existing LLMs by develo** smaller LLMs for specific domains - we call them domain-specific LLMs. Specifically, we start off with HPC as a domain and propose a novel tokenizer named Tokompiler, designed specifically for preprocessing code in HPC and compilation-centric tasks. Tokompiler leverages knowledge of language primitives to generate language-oriented tokens, providing a context-aware understanding of code structure while avoiding human semantics attributed to code structures completely. We applied Tokompiler to pre-train two state-of-the-art models, SPT-Code and Polycoder, for a Fortran code corpus mined from GitHub. We evaluate the performance of these models against the conventional LLMs. Results demonstrate that Tokompiler significantly enhances code completion accuracy and semantic understanding compared to traditional tokenizers in normalized-perplexity tests, down to ~1 perplexity score. This research opens avenues for further advancements in domain-specific LLMs, catering to the unique demands of HPC and compilation tasks. △ Less

Submitted 29 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

arXiv:2308.08002 [pdf, ps, other]

Quantifying OpenMP: Statistical Insights into Usage and Adoption

Authors: Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

Abstract: In high-performance computing (HPC), the demand for efficient parallel programming models has grown dramatically since the end of Dennard Scaling and the subsequent move to multi-core CPUs. OpenMP stands out as a popular choice due to its simplicity and portability, offering a directive-driven approach for shared-memory parallel programming. Despite its wide adoption, however, there is a lack of c… ▽ More In high-performance computing (HPC), the demand for efficient parallel programming models has grown dramatically since the end of Dennard Scaling and the subsequent move to multi-core CPUs. OpenMP stands out as a popular choice due to its simplicity and portability, offering a directive-driven approach for shared-memory parallel programming. Despite its wide adoption, however, there is a lack of comprehensive data on the actual usage of OpenMP constructs, hindering unbiased insights into its popularity and evolution. This paper presents a statistical analysis of OpenMP usage and adoption trends based on a novel and extensive database, HPCORPUS, compiled from GitHub repositories containing C, C++, and Fortran code. The results reveal that OpenMP is the dominant parallel programming model, accounting for 45% of all analyzed parallel APIs. Furthermore, it has demonstrated steady and continuous growth in popularity over the past decade. Analyzing specific OpenMP constructs, the study provides in-depth insights into their usage patterns and preferences across the three languages. Notably, we found that while OpenMP has a strong "common core" of constructs in common usage (while the rest of the API is less used), there are new adoption trends as well, such as simd and target directives for accelerated computing and task for irregular parallelism. Overall, this study sheds light on OpenMP's significance in HPC applications and provides valuable data for researchers and practitioners. It showcases OpenMP's versatility, evolving adoption, and relevance in contemporary parallel programming, underlining its continued role in HPC applications and beyond. These statistical insights are essential for making informed decisions about parallelization strategies and provide a foundation for further advancements in parallel programming models and techniques. △ Less

Submitted 17 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

arXiv:2305.11999 [pdf, other]

Advising OpenMP Parallelization via a Graph-Based Approach with Transformers

Authors: Tal Kadosh, Nadav Schneider, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

Abstract: There is an ever-present need for shared memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code manually is complex and effort-intensive. Thus, many deterministic source-to-source (S2S) compilers have emerged, intending to automate the process of translat… ▽ More There is an ever-present need for shared memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code manually is complex and effort-intensive. Thus, many deterministic source-to-source (S2S) compilers have emerged, intending to automate the process of translating serial to parallel code. However, recent studies have shown that these compilers are impractical in many scenarios. In this work, we combine the latest advancements in the field of AI and natural language processing (NLP) with the vast amount of open-source code to address the problem of automatic parallelization. Specifically, we propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code, given its serial version. OMPify is based on a Transformer-based model that leverages a graph-based representation of source code that exploits the inherent structure of code. We evaluated our tool by predicting the parallelization pragmas and attributes of a large corpus of (over 54,000) snippets of serial code written in C and C++ languages (Open-OMP-Plus). Our results demonstrate that OMPify outperforms existing approaches, the general-purposed and popular ChatGPT and targeted PragFormer models, in terms of F1 score and accuracy. Specifically, OMPify achieves up to 90% accuracy on commonly-used OpenMP benchmark tests such as NAS, SPEC, and PolyBench. Additionally, we performed an ablation study to assess the impact of different model components and present interesting insights derived from the study. Lastly, we also explored the potential of using data augmentation and curriculum learning techniques to improve the model's robustness and generalization capabilities. △ Less

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2305.09438 [pdf, other]

MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with Transformers

Authors: Nadav Schneider, Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

Abstract: Message Passing Interface (MPI) plays a crucial role in distributed memory parallelization across multiple nodes. However, parallelizing MPI code manually, and specifically, performing domain decomposition, is a challenging, error-prone task. In this paper, we address this problem by develo** MPI-RICAL, a novel data-driven, programming-assistance tool that assists programmers in writing domain d… ▽ More Message Passing Interface (MPI) plays a crucial role in distributed memory parallelization across multiple nodes. However, parallelizing MPI code manually, and specifically, performing domain decomposition, is a challenging, error-prone task. In this paper, we address this problem by develo** MPI-RICAL, a novel data-driven, programming-assistance tool that assists programmers in writing domain decomposition based distributed memory parallelization code. Specifically, we train a supervised language model to suggest MPI functions and their proper locations in the code on the fly. We also introduce MPICodeCorpus, the first publicly available corpus of MPI-based parallel programs that is created by mining more than 15,000 open-source repositories on GitHub. Experimental results have been done on MPICodeCorpus and more importantly, on a compiled benchmark of MPI-based parallel programs for numerical computations that represent real-world scientific applications. MPI-RICAL achieves F1 scores between 0.87-0.91 on these programs, demonstrating its accuracy in suggesting correct MPI functions at appropriate code locations.. The source code used in this work, as well as other relevant sources, are available at: https://github.com/Scientific-Computing-Lab-NRCN/MPI-rical △ Less

Submitted 30 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

arXiv:2211.00844 [pdf, ps, other]

Introducing the Quantum Research Kernels: Lessons from Classical Parallel Computing

Authors: A. Y. Matsuura, Timothy G. Mattson

Abstract: Quantum computing represents a paradigm shift for computation requiring an entirely new computer architecture. However, there is much that can be learned from traditional classical computer engineering. In this paper, we describe the Parallel Research Kernels (PRK), a tool that was very useful for designing classical parallel computing systems. The PRK are simple kernels written to expose bottlene… ▽ More Quantum computing represents a paradigm shift for computation requiring an entirely new computer architecture. However, there is much that can be learned from traditional classical computer engineering. In this paper, we describe the Parallel Research Kernels (PRK), a tool that was very useful for designing classical parallel computing systems. The PRK are simple kernels written to expose bottlenecks that limit classical parallel computing performance. We hypothesize that an analogous tool for quantum computing, Quantum Research Kernels (QRK), may similarly aid the co-design of software and hardware for quantum computing systems, and we give a few examples of representative QRKs. △ Less

Submitted 1 November, 2022; originally announced November 2022.

Comments: 2 pages

arXiv:2112.03235 [pdf, other]

Simulation Intelligence: Towards a New Generation of Scientific Methods

Authors: Alexander Lavin, David Krakauer, Hector Zenil, Justin Gottschlich, Tim Mattson, Johann Brehmer, Anima Anandkumar, Sanjay Choudry, Kamil Rocki, Atılım Güneş Baydin, Carina Prunkl, Brooks Paige, Olexandr Isayev, Erik Peterson, Peter L. McMahon, Jakob Macke, Kyle Cranmer, Jiaxin Zhang, Haruko Wainwright, Adi Hanuka, Manuela Veloso, Samuel Assefa, Stephan Zheng, Avi Pfeffer

Abstract: The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simul… ▽ More The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offers immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use-cases for human-machine teaming and automated science. △ Less

Submitted 27 November, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

arXiv:2104.01661 [pdf, ps, other]

LAGraph: Linear Algebra, Network Analysis Libraries, and the Study of Graph Algorithms

Authors: Gábor Szárnyas, David A. Bader, Timothy A. Davis, James Kitchen, Timothy G. Mattson, Scott McMillan, Erik Welch

Abstract: Graph algorithms can be expressed in terms of linear algebra. GraphBLAS is a library of low-level building blocks for such algorithms that targets algorithm developers. LAGraph builds on top of the GraphBLAS to target users of graph algorithms with high-level algorithms common in network analysis. In this paper, we describe the first release of the LAGraph library, the design decisions behind the… ▽ More Graph algorithms can be expressed in terms of linear algebra. GraphBLAS is a library of low-level building blocks for such algorithms that targets algorithm developers. LAGraph builds on top of the GraphBLAS to target users of graph algorithms with high-level algorithms common in network analysis. In this paper, we describe the first release of the LAGraph library, the design decisions behind the library, and performance using the GAP benchmark suite. LAGraph, however, is much more than a library. It is also a project to document and analyze the full range of algorithms enabled by the GraphBLAS. To that end, we have developed a compact and intuitive notation for describing these algorithms. In this paper, we present that notation with examples from the GAP benchmark suite. △ Less

Submitted 4 April, 2021; originally announced April 2021.

Comments: Accepted to GrAPL 2021

arXiv:2006.05265 [pdf, other]

MISIM: A Neural Code Semantics Similarity System Using the Context-Aware Semantics Structure

Authors: Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Niranjan Hasabnis, Paul Petersen, Timothy Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

Abstract: Code semantics similarity can be used for many tasks such as code recommendation, automated software defect correction, and clone detection. Yet, the accuracy of such systems has not yet reached a level of general purpose reliability. To help address this, we present Machine Inferred Code Similarity (MISIM), a neural code semantics similarity system consisting of two core components: (i)MISIM uses… ▽ More Code semantics similarity can be used for many tasks such as code recommendation, automated software defect correction, and clone detection. Yet, the accuracy of such systems has not yet reached a level of general purpose reliability. To help address this, we present Machine Inferred Code Similarity (MISIM), a neural code semantics similarity system consisting of two core components: (i)MISIM uses a novel context-aware semantics structure, which was purpose-built to lift semantics from code syntax; (ii)MISIM uses an extensible neural code similarity scoring algorithm, which can be used for various neural network architectures with learned parameters. We compare MISIM to four state-of-the-art systems, including two additional hand-customized models, over 328K programs consisting of over 18 million lines of code. Our experiments show that MISIM has 8.08% better accuracy (using MAP@R) compared to the next best performing system. △ Less

Submitted 2 June, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: arXiv admin note: text overlap with arXiv:2003.11118

arXiv:2003.11118 [pdf, ps, other]

Context-Aware Parse Trees

Authors: Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Paul Petersen, Jesmin Jahan Tithi, Tim Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

Abstract: The simplified parse tree (SPT) presented in Aroma, a state-of-the-art code recommendation system, is a tree-structured representation used to infer code semantics by capturing program \emph{structure} rather than program \emph{syntax}. This is a departure from the classical abstract syntax tree, which is principally driven by programming language syntax. While we believe a semantics-driven repres… ▽ More The simplified parse tree (SPT) presented in Aroma, a state-of-the-art code recommendation system, is a tree-structured representation used to infer code semantics by capturing program \emph{structure} rather than program \emph{syntax}. This is a departure from the classical abstract syntax tree, which is principally driven by programming language syntax. While we believe a semantics-driven representation is desirable, the specifics of an SPT's construction can impact its performance. We analyze these nuances and present a new tree structure, heavily influenced by Aroma's SPT, called a \emph{context-aware parse tree} (CAPT). CAPT enhances SPT by providing a richer level of semantic representation. Specifically, CAPT provides additional binding support for language-specific techniques for adding semantically-salient features, and language-agnostic techniques for removing syntactically-present but semantically-irrelevant features. Our research quantitatively demonstrates the value of our proposed semantically-salient features, enabling a specific CAPT configuration to be 39\% more accurate than SPT across the 48,610 programs we analyzed. △ Less

Submitted 24 March, 2020; originally announced March 2020.

arXiv:1806.05909 [pdf]

Planet Size Distribution from the Kepler Mission and its Implications for Planet Formation

Authors: Li Zeng, Stein B. Jacobsen, Eugenia Hyung, Andrew Vanderburg, Mercedes Lopez-Morales, Dimitar D. Sasselov, Juan Perez-Mercader, Michail I. Petaev, David W. Latham, Raphaëlle D. Haywood, Thomas K. R. Mattson

Abstract: The size distribution of exoplanets is a bimodal division into two groups: Rocky planet (<2 Earth radii) and water-rich planet (>2 Earth radii) with or without gaseous envelope. The size distribution of exoplanets is a bimodal division into two groups: Rocky planet (<2 Earth radii) and water-rich planet (>2 Earth radii) with or without gaseous envelope. △ Less

Submitted 15 June, 2018; originally announced June 2018.

Comments: 2 pages, 4 figures, abstract publicly available since January 2017 (see https://www.hou.usra.edu/meetings/lpsc2017/pdf/1576.pdf) and presented at the 48th LPSC on March 23, 2017. Also See: http://adsabs.harvard.edu/abs/2017LPI....48.1576Z

arXiv:1803.07244 [pdf, other]

The Three Pillars of Machine Programming

Authors: Justin Gottschlich, Armando Solar-Lezama, Nesime Tatbul, Michael Carbin, Martin Rinard, Regina Barzilay, Saman Amarasinghe, Joshua B Tenenbaum, Tim Mattson

Abstract: In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are: (i) intention, (ii) invention, and(iii) adaptation. Intention emphasizes advancements in the human-to-computer and computer-to-machine-learning interfaces. Invention emphasizes the creation or refinement of algorithms or core hardwar… ▽ More In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are: (i) intention, (ii) invention, and(iii) adaptation. Intention emphasizes advancements in the human-to-computer and computer-to-machine-learning interfaces. Invention emphasizes the creation or refinement of algorithms or core hardware and software building blocks through machine learning (ML). Adaptation emphasizes advances in the use of ML-based constructs to autonomously evolve software. △ Less

Submitted 26 June, 2021; v1 submitted 19 March, 2018; originally announced March 2018.

arXiv:1709.07536 [pdf, other]

A Zero-Positive Learning Approach for Diagnosing Software Performance Regressions

Authors: Mejbah Alam, Justin Gottschlich, Nesime Tatbul, Javier Turek, Timothy Mattson, Abdullah Muzahid

Abstract: The field of machine programming (MP), the automation of the development of software, is making notable research advances. This is, in part, due to the emergence of a wide range of novel techniques in machine learning. In this paper, we apply MP to the automation of software performance regression testing. A performance regression is a software performance degradation caused by a code change. We p… ▽ More The field of machine programming (MP), the automation of the development of software, is making notable research advances. This is, in part, due to the emergence of a wide range of novel techniques in machine learning. In this paper, we apply MP to the automation of software performance regression testing. A performance regression is a software performance degradation caused by a code change. We present AutoPerf - a novel approach to automate regression testing that utilizes three core techniques: (i) zero-positive learning, (ii) autoencoders, and (iii) hardware telemetry. We demonstrate AutoPerf's generality and efficacy against 3 types of performance regressions across 10 real performance bugs in 7 benchmark and open-source programs. On average, AutoPerf exhibits 4% profiling overhead and accurately diagnoses more performance bugs than prior state-of-the-art approaches. Thus far, AutoPerf has produced no false negatives. △ Less

Submitted 1 January, 2020; v1 submitted 21 September, 2017; originally announced September 2017.

arXiv:1707.00721 [pdf, other]

doi 10.1109/HPEC.2017.8091077

Version 0.1 of the BigDAWG Polystore System

Authors: Vijay Gadepally, Kyle OBrien, Adam Dziedzic, Aaron Elmore, Jeremy Kepner, Samuel Madden, Tim Mattson, Jennie Rogers, Zuohao She, Michael Stonebraker

Abstract: A polystore system is a database management system (DBMS) composed of integrated heterogeneous database engines and multiple programming languages. By matching data to the storage engine best suited to its needs, complex analytics run faster and flexible storage choices helps improve data organization. BigDAWG (Big Data Working Group) is our reference implementation of a polystore system. In this… ▽ More A polystore system is a database management system (DBMS) composed of integrated heterogeneous database engines and multiple programming languages. By matching data to the storage engine best suited to its needs, complex analytics run faster and flexible storage choices helps improve data organization. BigDAWG (Big Data Working Group) is our reference implementation of a polystore system. In this paper, we describe the current BigDAWG software release which supports PostgreSQL, Accumulo and SciDB. We describe the overall architecture, API and initial results of applying BigDAWG to the MIMIC II medical dataset. △ Less

Submitted 3 July, 2017; originally announced July 2017.

Comments: Accepted to IEEE HPEC 2017

arXiv:1701.05799 [pdf]

BigDAWG Polystore Release and Demonstration

Authors: Kyle OBrien, Vijay Gadepally, Jennie Duggan, Adam Dziedzic, Aaron Elmore, Jeremy Kepner, Samuel Madden, Tim Mattson, Zuohao She, Michael Stonebraker

Abstract: The Intel Science and Technology Center for Big Data is develo** a reference implementation of a Polystore database. The BigDAWG (Big Data Working Group) system supports "many sizes" of database engines, multiple programming languages and complex analytics for a variety of workloads. Our recent efforts include application of BigDAWG to an ocean metagenomics problem and containerization of BigDAW… ▽ More The Intel Science and Technology Center for Big Data is develo** a reference implementation of a Polystore database. The BigDAWG (Big Data Working Group) system supports "many sizes" of database engines, multiple programming languages and complex analytics for a variety of workloads. Our recent efforts include application of BigDAWG to an ocean metagenomics problem and containerization of BigDAWG. We intend to release an open source BigDAWG v1.0 in the Spring of 2017. In this article, we will demonstrate a number of polystore applications developed with oceanographic researchers at MIT and describe our forthcoming open source release of the BigDAWG system. △ Less

Submitted 18 January, 2017; originally announced January 2017.

arXiv:1609.07548 [pdf, other]

doi 10.1109/HPEC.2016.7761636

The BigDAWG Polystore System and Architecture

Authors: Vijay Gadepally, Peinan Chen, Jennie Duggan, Aaron Elmore, Brandon Haynes, Jeremy Kepner, Samuel Madden, Tim Mattson, Michael Stonebraker

Abstract: Organizations are often faced with the challenge of providing data management solutions for large, heterogenous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system can have adverse performance and efficien… ▽ More Organizations are often faced with the challenge of providing data management solutions for large, heterogenous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system can have adverse performance and efficiency effects. As a part of the Intel Science and Technology Center on Big Data, we are develo** a polystore system designed for such problems. BigDAWG (short for the Big Data Analytics Working Group) is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands and a middleware that provides a uniform multi--island interface. Initial results from a prototype of the BigDAWG system applied to a medical dataset validate polystore concepts. In this article, we will describe polystore databases, the current BigDAWG architecture and its application on the MIMIC II medical dataset, initial performance results and our future development plans. △ Less

Submitted 23 September, 2016; originally announced September 2016.

Comments: 6 pages, 5 figures, IEEE High Performance Extreme Computing (HPEC) conference 2016

arXiv:1606.05797 [pdf]

doi 10.1109/HPEC.2016.7761647

Associative Array Model of SQL, NoSQL, and NewSQL Databases

Authors: Jeremy Kepner, Vijay Gadepally, Dylan Hutchison, Hayden Jananthan, Timothy Mattson, Siddharth Samsi, Albert Reuther

Abstract: The success of SQL, NoSQL, and NewSQL databases is a reflection of their ability to provide significant functionality and performance benefits for specific domains, such as financial transactions, internet search, and data analysis. The BigDAWG polystore seeks to provide a mechanism to allow applications to transparently achieve the benefits of diverse databases while insulating applications from… ▽ More The success of SQL, NoSQL, and NewSQL databases is a reflection of their ability to provide significant functionality and performance benefits for specific domains, such as financial transactions, internet search, and data analysis. The BigDAWG polystore seeks to provide a mechanism to allow applications to transparently achieve the benefits of diverse databases while insulating applications from the details of these databases. Associative arrays provide a common approach to the mathematics found in different databases: sets (SQL), graphs (NoSQL), and matrices (NewSQL). This work presents the SQL relational model in terms of associative arrays and identifies the key mathematical properties that are preserved within SQL. These properties include associativity, commutativity, distributivity, identities, annihilators, and inverses. Performance measurements on distributivity and associativity show the impact these properties can have on associative array operations. These results demonstrate that associative arrays could provide a mathematical model for polystores to optimize the exchange of data and execution queries. △ Less

Submitted 18 June, 2016; originally announced June 2016.

Comments: 9 pages; 6 figures; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2016

arXiv:1606.05790 [pdf, other]

doi 10.1109/HPEC.2016.7761646

Mathematical Foundations of the GraphBLAS

Authors: Jeremy Kepner, Peter Aaltonen, David Bader, Aydın Buluc, Franz Franchetti, John Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, Henning Meyerhenke, Scott McMillan, Jose Moreira, John D. Owens, Carl Yang, Marcin Zalewski, Timothy Mattson

Abstract: The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix based graph algorithms to the broadest possible audience. Mathematically the Graph- BLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the mathematics of th… ▽ More The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix based graph algorithms to the broadest possible audience. Mathematically the Graph- BLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the mathematics of the GraphBLAS. Graphs represent connections between vertices with edges. Matrices can represent a wide range of graphs using adjacency matrices or incidence matrices. Adjacency matrices are often easier to analyze while incidence matrices are often better for representing data. Fortunately, the two are easily connected by matrix mul- tiplication. A key feature of matrix mathematics is that a very small number of matrix operations can be used to manipulate a very wide range of graphs. This composability of small number of operations is the foundation of the GraphBLAS. A standard such as the GraphBLAS can only be effective if it has low performance overhead. Performance measurements of prototype GraphBLAS implementations indicate that the overhead is low. △ Less

Submitted 13 July, 2016; v1 submitted 18 June, 2016; originally announced June 2016.

Comments: 9 pages; 11 figures; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2016. arXiv admin note: text overlap with arXiv:1504.01039

arXiv:1602.08791 [pdf, other]

The BigDAWG Architecture

Authors: Vijay Gadepally, Jennie Duggan, Aaron Elmore, Jeremy Kepner, Samuel Madden, Tim Mattson, Michael Stonebraker

Abstract: BigDAWG is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands of information and a middleware that provides a uniform mult… ▽ More BigDAWG is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands of information and a middleware that provides a uniform multi-island interface. In this article, we describe the current architecture of BigDAWG, its application on the MIMIC II medical dataset, and our plans for the mechanics of cross-system queries. During the presentation, we will also deliver a brief demonstration of the current version of BigDAWG. △ Less

Submitted 28 February, 2016; originally announced February 2016.

arXiv:1504.01039 [pdf]

doi 10.1016/j.procs.2015.05.353

Graphs, Matrices, and the GraphBLAS: Seven Good Reasons

Authors: Jeremy Kepner, David Bader, Aydın Buluc, John Gilbert, Timothy Mattson, Henning Meyerhenke

Abstract: The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a… ▽ More The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a number of promising solutions to these challenges. The GraphBLAS standard (istc-bigdata.org/GraphBlas) is being developed to bring the potential of matrix based graph algorithms to the broadest possible audience. The GraphBLAS mathematically defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the GraphBLAS and describes how the GraphBLAS can be used to address many of the challenges associated with analysis of graphs. △ Less

Submitted 16 September, 2023; v1 submitted 4 April, 2015; originally announced April 2015.

Comments: 10 pages; International Conference on Computational Science workshop on the Applications of Matrix Computational Methods in the Analysis of Modern Data

Journal ref: Procedia Computer Science Volume 51, 2015, Pages 2453-2462, International Conference On Computational Science

arXiv:1408.0393 [pdf]

doi 10.1109/HPEC.2013.6670338

Standards for Graph Algorithm Primitives

Authors: Tim Mattson, David Bader, Jon Berry, Aydin Buluc, Jack Dongarra, Christos Faloutsos, John Feo, John Gilbert, Joseph Gonzalez, Bruce Hendrickson, Jeremy Kepner, Charles Leiserson, Andrew Lumsdaine, David Padua, Stephen Poole, Steve Reinhardt, Mike Stonebraker, Steve Wallach, Andrew Yoo

Abstract: It is our view that the state of the art in constructing a large collection of graph algorithms in terms of linear algebraic operations is mature enough to support the emergence of a standard set of primitive building blocks. This paper is a position paper defining the problem and announcing our intention to launch an open effort to define this standard. It is our view that the state of the art in constructing a large collection of graph algorithms in terms of linear algebraic operations is mature enough to support the emergence of a standard set of primitive building blocks. This paper is a position paper defining the problem and announcing our intention to launch an open effort to define this standard. △ Less

Submitted 2 August, 2014; originally announced August 2014.

Comments: 2 pages, IEEE HPEC 2013

Showing 1–25 of 25 results for author: Mattson, T