Showing 1–25 of 25 results for author: Mattson, T

  1. Distributed Ranges: A Model for Distributed Data Structures, Algorithms, and Views

    Authors: Benjamin Brock, Robert Cohn, Suyash Bakshi, Tuomas Karna, Jeongnim Kim, Mateusz Nowak, Łukasz Ślusarczyk, Kacper Stefanski, Timothy G. Mattson

    Abstract: Data structures and algorithms are essential building blocks for programs, and \emph{distributed data structures}, which automatically partition data across multiple memory locales, are essential to writing high-level parallel programs. While many projects have designed and implemented C++ distributed data structures and algorithms, there has not been widespread adoption of an interoperable model…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: To appear in ACM International Conference on Supercomputing (ICS) 2024

    Journal ref: In Proceedings of the 38th ACM International Conference on Supercomputing (ICS 2024) 236-246

  2. arXiv:2405.13918  [pdf, other]

    quant-ph

    An Abstraction Hierarchy Toward Productive Quantum Programming

    Authors: Olivia Di Matteo, Santiago Núñez-Corrales, Michał Stęchły, Steven P. Reinhardt, Tim Mattson

    Abstract: Experience from seven decades of classical computing suggests that a sustainable computer industry depends on a community of software engineers writing programs to address a wide variety of specific end-user needs, achieving both performance and utility in the process. Quantum computing is an emerging technology, and we do not yet have the insight to understand what quantum software tools and prac…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 11 pages, 3 figures. Submitted to IEEE QCE 24

  3. arXiv:2402.09126  [pdf, other]

    cs.DC cs.AI cs.CL cs.LG cs.SE

    MPIrigen: MPI Code Generation through Domain-Specific Language Models

    Authors: Nadav Schneider, Niranjan Hasabnis, Vy A. Vo, Tal Kadosh, Neva Krien, Mihai Capotă, Guy Tamir, Ted Willke, Nesreen Ahmed, Yuval Pinter, Timothy Mattson, Gal Oren

    Abstract: The imperative need to scale computation across numerous nodes highlights the significance of efficient parallel computing, particularly in the realm of Message Passing Interface (MPI) integration. The challenging parallel programming task of generating MPI-based parallel programs has remained unexplored. This study first investigates the performance of state-of-the-art language models in generati…

    Submitted 23 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  4. arXiv:2402.02018  [pdf, other]

    cs.LG

    The Landscape and Challenges of HPC Research and LLMs

    Authors: Le Chen, Nesreen K. Ahmed, Akash Dutta, Arijit Bhattacharjee, Sixing Yu, Quazi Ishtiaque Mahmud, Waqwoya Abebe, Hung Phan, Aishwarya Sarkar, Branden Butler, Niranjan Hasabnis, Gal Oren, Vy A. Vo, Juan Pablo Munoz, Theodore L. Willke, Tim Mattson, Ali Jannesari

    Abstract: Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breach…

    Submitted 6 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  5. arXiv:2312.13322  [pdf, other]

    cs.PL cs.AI cs.LG cs.SE

    Domain-Specific Code Language Models: Unraveling the Potential for HPC Codes and Tasks

    Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Mihai Capota, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

    Abstract: With easier access to powerful compute resources, there is a growing trend in AI for software development to develop larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size and demand expensive compute resources for training. This is partly because these LLMs for HPC tasks are obtained by…

    Submitted 20 December, 2023; originally announced December 2023.

  6. arXiv:2308.09440  [pdf, other]

    cs.CL cs.PL

    Scope is all you need: Transforming LLMs for HPC Code

    Authors: Tal Kadosh, Niranjan Hasabnis, Vy A. Vo, Nadav Schneider, Neva Krien, Abdul Wasay, Nesreen Ahmed, Ted Willke, Guy Tamir, Yuval Pinter, Timothy Mattson, Gal Oren

    Abstract: With easier access to powerful compute resources, there is a growing trend in the field of AI for software development to develop larger and larger language models (LLMs) to address a variety of programming tasks. Even LLMs applied to tasks from the high-performance computing (HPC) domain are huge in size (e.g., billions of parameters) and demand expensive compute resources for training. We found…

    Submitted 29 September, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

  7. arXiv:2308.08002  [pdf, ps, other]

    cs.DC cs.DB

    Quantifying OpenMP: Statistical Insights into Usage and Adoption

    Authors: Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

    Abstract: In high-performance computing (HPC), the demand for efficient parallel programming models has grown dramatically since the end of Dennard Scaling and the subsequent move to multi-core CPUs. OpenMP stands out as a popular choice due to its simplicity and portability, offering a directive-driven approach for shared-memory parallel programming. Despite its wide adoption, however, there is a lack of c…

    Submitted 17 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

  8. arXiv:2305.11999  [pdf, other]

    cs.DC cs.AI cs.LG cs.PF

    Advising OpenMP Parallelization via a Graph-Based Approach with Transformers

    Authors: Tal Kadosh, Nadav Schneider, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

    Abstract: There is an ever-present need for shared memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code manually is complex and effort-intensive. Thus, many deterministic source-to-source (S2S) compilers have emerged, intending to automate the process of translat…

    Submitted 16 May, 2023; originally announced May 2023.

  9. arXiv:2305.09438  [pdf, other]

    cs.DC cs.CL cs.LG

    MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with Transformers

    Authors: Nadav Schneider, Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

    Abstract: Message Passing Interface (MPI) plays a crucial role in distributed memory parallelization across multiple nodes. However, parallelizing MPI code manually, and specifically, performing domain decomposition, is a challenging, error-prone task. In this paper, we address this problem by developing MPI-RICAL, a novel data-driven, programming-assistance tool that assists programmers in writing domain d…

    Submitted 30 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

  10. arXiv:2211.00844  [pdf, ps, other]

    quant-ph cs.DC

    Introducing the Quantum Research Kernels: Lessons from Classical Parallel Computing

    Authors: A. Y. Matsuura, Timothy G. Mattson

    Abstract: Quantum computing represents a paradigm shift for computation requiring an entirely new computer architecture. However, there is much that can be learned from traditional classical computer engineering. In this paper, we describe the Parallel Research Kernels (PRK), a tool that was very useful for designing classical parallel computing systems. The PRK are simple kernels written to expose bottlene…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: 2 pages

  11. arXiv:2112.03235  [pdf, other]

    cs.AI cs.CE cs.LG cs.MS

    Simulation Intelligence: Towards a New Generation of Scientific Methods

    Authors: Alexander Lavin, David Krakauer, Hector Zenil, Justin Gottschlich, Tim Mattson, Johann Brehmer, Anima Anandkumar, Sanjay Choudry, Kamil Rocki, Atılım Güneş Baydin, Carina Prunkl, Brooks Paige, Olexandr Isayev, Erik Peterson, Peter L. McMahon, Jakob Macke, Kyle Cranmer, Jiaxin Zhang, Haruko Wainwright, Adi Hanuka, Manuela Veloso, Samuel Assefa, Stephan Zheng, Avi Pfeffer

    Abstract: The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simul…

    Submitted 27 November, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

  12. arXiv:2104.01661  [pdf, ps, other]

    cs.MS cs.DS

    LAGraph: Linear Algebra, Network Analysis Libraries, and the Study of Graph Algorithms

    Authors: Gábor Szárnyas, David A. Bader, Timothy A. Davis, James Kitchen, Timothy G. Mattson, Scott McMillan, Erik Welch

    Abstract: Graph algorithms can be expressed in terms of linear algebra. GraphBLAS is a library of low-level building blocks for such algorithms that targets algorithm developers. LAGraph builds on top of the GraphBLAS to target users of graph algorithms with high-level algorithms common in network analysis. In this paper, we describe the first release of the LAGraph library, the design decisions behind the…

    Submitted 4 April, 2021; originally announced April 2021.

    Comments: Accepted to GrAPL 2021

  13. arXiv:2006.05265  [pdf, other]

    cs.LG cs.SE stat.ML

    MISIM: A Neural Code Semantics Similarity System Using the Context-Aware Semantics Structure

    Authors: Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Niranjan Hasabnis, Paul Petersen, Timothy Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

    Abstract: Code semantics similarity can be used for many tasks such as code recommendation, automated software defect correction, and clone detection. Yet, the accuracy of such systems has not yet reached a level of general purpose reliability. To help address this, we present Machine Inferred Code Similarity (MISIM), a neural code semantics similarity system consisting of two core components: (i) MISIM uses…

    Submitted 2 June, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: arXiv admin note: text overlap with arXiv:2003.11118

  14. arXiv:2003.11118  [pdf, ps, other]

    cs.PL cs.AI

    Context-Aware Parse Trees

    Authors: Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Paul Petersen, Jesmin Jahan Tithi, Tim Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich

    Abstract: The simplified parse tree (SPT) presented in Aroma, a state-of-the-art code recommendation system, is a tree-structured representation used to infer code semantics by capturing program \emph{structure} rather than program \emph{syntax}. This is a departure from the classical abstract syntax tree, which is principally driven by programming language syntax. While we believe a semantics-driven repres…

    Submitted 24 March, 2020; originally announced March 2020.

  15. arXiv:1806.05909  [pdf]

    astro-ph.EP

    Planet Size Distribution from the Kepler Mission and its Implications for Planet Formation

    Authors: Li Zeng, Stein B. Jacobsen, Eugenia Hyung, Andrew Vanderburg, Mercedes Lopez-Morales, Dimitar D. Sasselov, Juan Perez-Mercader, Michail I. Petaev, David W. Latham, Raphaëlle D. Haywood, Thomas K. R. Mattson

    Abstract: The size distribution of exoplanets is a bimodal division into two groups: Rocky planet (<2 Earth radii) and water-rich planet (>2 Earth radii) with or without gaseous envelope.

    Submitted 15 June, 2018; originally announced June 2018.

    Comments: 2 pages, 4 figures, abstract publicly available since January 2017 (see https://www.hou.usra.edu/meetings/lpsc2017/pdf/1576.pdf) and presented at the 48th LPSC on March 23, 2017. Also See: http://adsabs.harvard.edu/abs/2017LPI....48.1576Z

  16. arXiv:1803.07244  [pdf, other]

    cs.AI cs.PL cs.SE

    The Three Pillars of Machine Programming

    Authors: Justin Gottschlich, Armando Solar-Lezama, Nesime Tatbul, Michael Carbin, Martin Rinard, Regina Barzilay, Saman Amarasinghe, Joshua B Tenenbaum, Tim Mattson

    Abstract: In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are: (i) intention, (ii) invention, and (iii) adaptation. Intention emphasizes advancements in the human-to-computer and computer-to-machine-learning interfaces. Invention emphasizes the creation or refinement of algorithms or core hardwar…

    Submitted 26 June, 2021; v1 submitted 19 March, 2018; originally announced March 2018.

  17. arXiv:1709.07536  [pdf, other]

    cs.SE cs.NE cs.PF

    A Zero-Positive Learning Approach for Diagnosing Software Performance Regressions

    Authors: Mejbah Alam, Justin Gottschlich, Nesime Tatbul, Javier Turek, Timothy Mattson, Abdullah Muzahid

    Abstract: The field of machine programming (MP), the automation of the development of software, is making notable research advances. This is, in part, due to the emergence of a wide range of novel techniques in machine learning. In this paper, we apply MP to the automation of software performance regression testing. A performance regression is a software performance degradation caused by a code change. We p…

    Submitted 1 January, 2020; v1 submitted 21 September, 2017; originally announced September 2017.

  18. Version 0.1 of the BigDAWG Polystore System

    Authors: Vijay Gadepally, Kyle OBrien, Adam Dziedzic, Aaron Elmore, Jeremy Kepner, Samuel Madden, Tim Mattson, Jennie Rogers, Zuohao She, Michael Stonebraker

    Abstract: A polystore system is a database management system (DBMS) composed of integrated heterogeneous database engines and multiple programming languages. By matching data to the storage engine best suited to its needs, complex analytics run faster and flexible storage choices help improve data organization. BigDAWG (Big Data Working Group) is our reference implementation of a polystore system. In this…

    Submitted 3 July, 2017; originally announced July 2017.

    Comments: Accepted to IEEE HPEC 2017

  19. arXiv:1701.05799  [pdf]

    cs.DB

    BigDAWG Polystore Release and Demonstration

    Authors: Kyle OBrien, Vijay Gadepally, Jennie Duggan, Adam Dziedzic, Aaron Elmore, Jeremy Kepner, Samuel Madden, Tim Mattson, Zuohao She, Michael Stonebraker

    Abstract: The Intel Science and Technology Center for Big Data is developing a reference implementation of a Polystore database. The BigDAWG (Big Data Working Group) system supports "many sizes" of database engines, multiple programming languages and complex analytics for a variety of workloads. Our recent efforts include application of BigDAWG to an ocean metagenomics problem and containerization of BigDAW…

    Submitted 18 January, 2017; originally announced January 2017.

  20. The BigDAWG Polystore System and Architecture

    Authors: Vijay Gadepally, Peinan Chen, Jennie Duggan, Aaron Elmore, Brandon Haynes, Jeremy Kepner, Samuel Madden, Tim Mattson, Michael Stonebraker

    Abstract: Organizations are often faced with the challenge of providing data management solutions for large, heterogeneous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system can have adverse performance and efficien…

    Submitted 23 September, 2016; originally announced September 2016.

    Comments: 6 pages, 5 figures, IEEE High Performance Extreme Computing (HPEC) conference 2016

  21. arXiv:1606.05797  [pdf]

    cs.DB cs.DC cs.PL

    Associative Array Model of SQL, NoSQL, and NewSQL Databases

    Authors: Jeremy Kepner, Vijay Gadepally, Dylan Hutchison, Hayden Jananthan, Timothy Mattson, Siddharth Samsi, Albert Reuther

    Abstract: The success of SQL, NoSQL, and NewSQL databases is a reflection of their ability to provide significant functionality and performance benefits for specific domains, such as financial transactions, internet search, and data analysis. The BigDAWG polystore seeks to provide a mechanism to allow applications to transparently achieve the benefits of diverse databases while insulating applications from…

    Submitted 18 June, 2016; originally announced June 2016.

    Comments: 9 pages; 6 figures; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2016

  22. arXiv:1606.05790  [pdf, other]

    cs.MS astro-ph.IM cs.DC cs.DS

    Mathematical Foundations of the GraphBLAS

    Authors: Jeremy Kepner, Peter Aaltonen, David Bader, Aydın Buluc, Franz Franchetti, John Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, Henning Meyerhenke, Scott McMillan, Jose Moreira, John D. Owens, Carl Yang, Marcin Zalewski, Timothy Mattson

    Abstract: The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix based graph algorithms to the broadest possible audience. Mathematically the GraphBLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the mathematics of th…

    Submitted 13 July, 2016; v1 submitted 18 June, 2016; originally announced June 2016.

    Comments: 9 pages; 11 figures; accepted to IEEE High Performance Extreme Computing (HPEC) conference 2016. arXiv admin note: text overlap with arXiv:1504.01039

  23. arXiv:1602.08791  [pdf, other]

    cs.DB

    The BigDAWG Architecture

    Authors: Vijay Gadepally, Jennie Duggan, Aaron Elmore, Jeremy Kepner, Samuel Madden, Tim Mattson, Michael Stonebraker

    Abstract: BigDAWG is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands of information and a middleware that provides a uniform mult…

    Submitted 28 February, 2016; originally announced February 2016.

  24. Graphs, Matrices, and the GraphBLAS: Seven Good Reasons

    Authors: Jeremy Kepner, David Bader, Aydın Buluc, John Gilbert, Timothy Mattson, Henning Meyerhenke

    Abstract: The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a…

    Submitted 16 September, 2023; v1 submitted 4 April, 2015; originally announced April 2015.

    Comments: 10 pages; International Conference on Computational Science workshop on the Applications of Matrix Computational Methods in the Analysis of Modern Data

    Journal ref: Procedia Computer Science Volume 51, 2015, Pages 2453-2462, International Conference On Computational Science

  25. arXiv:1408.0393  [pdf]

    cs.MS cs.DM cs.DS

    Standards for Graph Algorithm Primitives

    Authors: Tim Mattson, David Bader, Jon Berry, Aydin Buluc, Jack Dongarra, Christos Faloutsos, John Feo, John Gilbert, Joseph Gonzalez, Bruce Hendrickson, Jeremy Kepner, Charles Leiserson, Andrew Lumsdaine, David Padua, Stephen Poole, Steve Reinhardt, Mike Stonebraker, Steve Wallach, Andrew Yoo

    Abstract: It is our view that the state of the art in constructing a large collection of graph algorithms in terms of linear algebraic operations is mature enough to support the emergence of a standard set of primitive building blocks. This paper is a position paper defining the problem and announcing our intention to launch an open effort to define this standard.

    Submitted 2 August, 2014; originally announced August 2014.

    Comments: 2 pages, IEEE HPEC 2013