Showing 1–10 of 10 results for author: von Laszewski, G

  1. arXiv:2403.15721  [pdf, other]

    cs.DC

    Design and Implementation of an Analysis Pipeline for Heterogeneous Data

    Authors: Arup Kumar Sarker, Aymen Alsaadi, Niranda Perera, Mills Staylor, Gregor von Laszewski, Matteo Turilli, Ozgur Ozan Kilic, Mikhail Titov, Andre Merzky, Shantenu Jha, Geoffrey Fox

    Abstract: Managing and preparing complex data for deep learning, a prevalent approach in large-scale data science, can be challenging. Data transfer for model training also presents difficulties, impacting scientific fields like genomics, climate modeling, and astronomy. A large-scale solution like Google Pathways with a distributed execution environment for deep learning models exists but is proprietary. In…

    Submitted 7 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: 14 pages, 16 figures, 2 tables

    ACM Class: H.2.4; D.2.7; D.2.2

  2. arXiv:2401.08636   

    cs.DC cs.AI

    MLCommons Cloud Masking Benchmark with Early Stopping

    Authors: Varshitha Chennamsetti, Gregor von Laszewski, Ruochen Gu, Laiba Mehnaz, Juri Papay, Samuel Jackson, Jeyan Thiyagalingam, Sergey V. Samsonau, Geoffrey C. Fox

    Abstract: In this paper, we report on work performed for the MLCommons Science Working Group on the cloud masking benchmark. MLCommons is a consortium that develops and maintains several scientific benchmarks that aim to benefit developments in AI. The benchmarks are conducted on the High Performance Computing (HPC) Clusters of New York University and University of Virginia, as well as a commodity desktop.…

    Submitted 30 May, 2024; v1 submitted 11 December, 2023; originally announced January 2024.

    Comments: NYU did not approve the publication of the paper

  3. arXiv:2312.04799  [pdf, other]

    cs.DC cs.AI

    An Overview of MLCommons Cloud Mask Benchmark: Related Research and Data

    Authors: Gregor von Laszewski, Ruochen Gu

    Abstract: Cloud masking is a crucial task that is well-motivated for meteorology and its applications in environmental and atmospheric sciences. Its goal is, given satellite images, to accurately generate cloud masks that classify each pixel in an image as either cloud or clear sky. In this paper, we summarize some of the ongoing research activities in cloud masking, with a focus on the research and be…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 13 pages, 2 tables, 7 figures, 3 appendices

  4. arXiv:2310.17013  [pdf, other]

    cs.DC

    Whitepaper on Reusable Hybrid and Multi-Cloud Analytics Service Framework

    Authors: Gregor von Laszewski, Wo Chang, Russell Reinsch, Olivera Kotevska, Ali Karimi, Abdul Rahman Sattar, Garry Mazzaferro, Geoffrey C. Fox

    Abstract: Over the last several years, the computation landscape for conducting data analytics has completely changed. While in the past many activities were undertaken in isolation by companies and research institutions, today's infrastructure constitutes a wealth of services offered by a variety of providers that offer opportunities for reuse, and interactions while leveraging service colla…

    Submitted 25 October, 2023; originally announced October 2023.

  5. arXiv:2307.01394  [pdf, ps, other]

    cs.DC cs.AI cs.IR cs.LG

    In-depth Analysis On Parallel Processing Patterns for High-Performance Dataframes

    Authors: Niranda Perera, Arup Kumar Sarker, Mills Staylor, Gregor von Laszewski, Kaiying Shan, Supun Kamburugamuve, Chathura Widanage, Vibhatha Abeykoon, Thejaka Amila Kanewala, Geoffrey Fox

    Abstract: The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amoun…

    Submitted 3 July, 2023; originally announced July 2023.

    Report number: FGCS-D-23-00577R1

  6. arXiv:2210.16941  [pdf, other]

    cs.DC

    Hybrid Reusable Computational Analytics Workflow Management with Cloudmesh

    Authors: Gregor von Laszewski, J. P. Fleischer, Geoffrey C. Fox

    Abstract: In this paper, we summarize our effort to create and utilize a simple framework to coordinate computational analytics tasks with the help of a workflow system. Our design is based on a minimalistic approach while at the same time allowing access to computational resources offered through the owner's computer, HPC computing centers, cloud resources, and distributed systems in general. The access to…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: 12 pages, 3 appendices, 23 figures, 4 tables

  7. arXiv:2202.13874  [pdf, other]

    cs.LG cs.DC

    Time Series Analysis of Blockchain-Based Cryptocurrency Price Changes

    Authors: Jacques Fleischer, Gregor von Laszewski, Carlos Theran, Yohn Jairo Parra Bautista

    Abstract: In this paper we apply neural networks and Artificial Intelligence (AI) to historical records of high-risk cryptocurrency coins to train a prediction model that guesses their price. This paper's code contains Jupyter notebooks, one of which outputs a timeseries graph of any cryptocurrency price once a CSV file of the historical data is inputted into the program. Another Jupyter notebook trains an…

    Submitted 18 February, 2022; originally announced February 2022.

  8. arXiv:2108.06001  [pdf, other]

    cs.DC cs.AI

    HPTMT Parallel Operators for High Performance Data Science & Data Engineering

    Authors: Vibhatha Abeykoon, Supun Kamburugamuve, Chathura Widanage, Niranda Perera, Ahmet Uyar, Thejaka Amila Kanewala, Gregor von Laszewski, Geoffrey Fox

    Abstract: Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often, the lack of a clear definition of data structures and operators in the field ha…

    Submitted 12 August, 2021; originally announced August 2021.

  9. arXiv:2107.12807  [pdf, other]

    cs.DC cs.AI

    HPTMT: Operator-Based Architecture for Scalable High-Performance Data-Intensive Frameworks

    Authors: Supun Kamburugamuve, Chathura Widanage, Niranda Perera, Vibhatha Abeykoon, Ahmet Uyar, Thejaka Amila Kanewala, Gregor von Laszewski, Geoffrey Fox

    Abstract: Data-intensive applications impact many domains, and their steadily increasing size and complexity demand high-performance, highly usable environments. We integrate a set of ideas developed in various data science and data engineering frameworks. They employ a set of operators on specific data abstractions that include vectors, matrices, tensors, graphs, and tables. Our key concepts are inspired…

    Submitted 29 July, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

  10. arXiv:2010.03757  [pdf, other]

    cs.LG stat.ML

    AICov: An Integrative Deep Learning Framework for COVID-19 Forecasting with Population Covariates

    Authors: Geoffrey C. Fox, Gregor von Laszewski, Fugang Wang, Saumyadipta Pyne

    Abstract: The COVID-19 pandemic has profound global consequences on health, economic, social, political, and almost every major aspect of human life. Therefore, it is of great importance to model COVID-19 and other pandemics in terms of the broader social contexts in which they take place. We present the architecture of AICov, which provides an integrative deep learning framework for COVID-19 forecasting wi…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: 25 pages, 4 tables, 19 figures