-
MLCommons Cloud Masking Benchmark with Early Stopping
Authors:
Varshitha Chennamsetti,
Gregor von Laszewski,
Ruochen Gu,
Laiba Mehnaz,
Juri Papay,
Samuel Jackson,
Jeyan Thiyagalingam,
Sergey V. Samsonau,
Geoffrey C. Fox
Abstract:
In this paper, we report on work performed for the MLCommons Science Working Group on the cloud masking benchmark. MLCommons is a consortium that develops and maintains several scientific benchmarks that aim to benefit developments in AI. The benchmarks are conducted on the High Performance Computing (HPC) Clusters of New York University and University of Virginia, as well as a commodity desktop. We provide a description of the cloud masking benchmark, as well as a summary of our submission to MLCommons on the benchmark experiment we conducted. It includes a modification to the reference implementation of the cloud masking benchmark enabling early stopping. This benchmark is executed on the NYU HPC through a custom batch script that runs the various experiments through the batch queuing system while allowing for variation on the number of epochs trained. Our submission includes the modified code, a custom batch script to modify epochs, documentation, and the benchmark results. We report the highest accuracy (scientific metric) and the average time taken (performance metric) for training and inference that was achieved on NYU HPC Greene. We also provide a comparison of the compute capabilities between different systems by running the benchmark for one epoch. Our submission can be found in a Globus repository that is accessible to MLCommons Science Working Group.
Submitted 30 May, 2024; v1 submitted 11 December, 2023;
originally announced January 2024.
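The early stopping idea mentioned in the abstract above can be illustrated with a generic patience-based training loop. This is a minimal sketch under stated assumptions, not the benchmark's modified reference implementation: the function name `train_with_early_stopping`, the `patience` and `min_delta` parameters, and the two callbacks are all hypothetical.

```python
from typing import Callable, Tuple


def train_with_early_stopping(
    train_one_epoch: Callable[[int], None],
    validation_loss: Callable[[int], float],
    max_epochs: int,
    patience: int = 3,
    min_delta: float = 0.0,
) -> Tuple[int, float]:
    """Train for at most max_epochs, stopping early when validation stalls.

    Hypothetical helper: stops once the validation loss has failed to
    improve by more than min_delta for `patience` consecutive epochs.
    Returns (number of epochs actually run, best validation loss seen).
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    epochs_run = 0
    for epoch in range(max_epochs):
        train_one_epoch(epoch)
        loss = validation_loss(epoch)
        epochs_run = epoch + 1
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stop: no recent improvement
    return epochs_run, best
```

With a made-up loss sequence such as `[1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.5]` and `patience=3`, the loop stops before reaching the final epoch, which is the trade-off early stopping makes: shorter training time at the risk of missing a later improvement.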
-
Whitepaper on Reusable Hybrid and Multi-Cloud Analytics Service Framework
Authors:
Gregor von Laszewski,
Wo Chang,
Russell Reinsch,
Olivera Kotevska,
Ali Karimi,
Abdul Rahman Sattar,
Garry Mazzaferro,
Geoffrey C. Fox
Abstract:
Over the last several years, the computation landscape for conducting data analytics has completely changed. While in the past, a lot of the activities have been undertaken in isolation by companies, and research institutions, today's infrastructure constitutes a wealth of services offered by a variety of providers that offer opportunities for reuse, and interactions while leveraging service collaboration, and service cooperation.
This document focuses on expanding analytics services to develop a framework for reusable hybrid multi-service data analytics. It includes (a) a short technology review that explicitly targets the intersection of hybrid multi-provider analytics services, (b) a brief motivation based on the use cases we examined, (c) an enhancement of the concepts of services to showcase how hybrid as well as multi-provider services can be integrated and reused via the proposed framework, (d) a treatment of analytics service composition, and (e) the integration of container technologies to achieve state-of-the-art analytics service deployment.
Submitted 25 October, 2023;
originally announced October 2023.
-
Hybrid Reusable Computational Analytics Workflow Management with Cloudmesh
Authors:
Gregor von Laszewski,
J. P. Fleischer,
Geoffrey C. Fox
Abstract:
In this paper, we summarize our effort to create and utilize a simple framework to coordinate computational analytics tasks with the help of a workflow system. Our design is based on a minimalistic approach while at the same time allowing to access computational resources offered through the owner's computer, HPC computing centers, cloud resources, and distributed systems in general. The access to this framework includes a simple GUI for monitoring and managing the workflow, a REST service, a command line interface, as well as a Python interface. The resulting framework was developed for several examples targeting benchmarks of AI applications on hybrid compute resources and as an educational tool for teaching scientists and students sophisticated concepts to execute computations on resources ranging from a single computer to many thousands of computers as part of on-premise and cloud infrastructure. We demonstrate the usefulness of the tool on a number of examples. The code is available as an open-source project in GitHub and is based on an easy-to-enhance tool called cloudmesh.
Submitted 30 October, 2022;
originally announced October 2022.
-
Spatiotemporal Pattern Mining for Nowcasting Extreme Earthquakes in Southern California
Authors:
Bo Feng,
Geoffrey C. Fox
Abstract:
Geoscience and seismology have used the most advanced technologies and equipment to monitor seismic events globally over the past few decades. Given the enormous amount of data, modern GPU-powered deep learning presents a promising approach to analyzing data and discovering patterns. In recent years, many deep learning models have been successful at picking seismic waves. However, forecasting extreme earthquakes, which can cause disasters, remains an underdeveloped topic. Relevant research in spatiotemporal dynamics mining and forecasting, a crucial topic in many scientific research fields, has produced some successful predictions, and many of these studies apply deep neural networks successfully. In geology and Earth science, earthquake prediction is one of the world's most challenging problems, and cutting-edge deep learning technologies may help discover some valuable patterns. In this project, we propose a deep learning modeling approach, namely \tseqpre, to mine spatiotemporal patterns from data to nowcast extreme earthquakes by discovering visual dynamics in regional coarse-grained spatial grids over time. In this modeling approach, we use synthetic deep learning neural networks with domain knowledge in geoscience and seismology to exploit earthquake patterns for prediction using convolutional long short-term memory neural networks. Our experiments show a strong correlation between location prediction and magnitude prediction for earthquakes in Southern California. Ablation studies and visualization validate the effectiveness of the proposed modeling method.
Submitted 11 September, 2021; v1 submitted 20 December, 2020;
originally announced December 2020.
-
CryptoGRU: Low Latency Privacy-Preserving Text Analysis With GRU
Authors:
Bo Feng,
Qian Lou,
Lei Jiang,
Geoffrey C. Fox
Abstract:
Billions of text analysis requests containing private emails, personal text messages, and sensitive online reviews, are processed by recurrent neural networks (RNNs) deployed on public clouds every day. Although prior secure networks combine homomorphic encryption (HE) and garbled circuit (GC) to preserve users' privacy, naively adopting the HE and GC hybrid technique to implement RNNs suffers from long inference latency due to slow activation functions. In this paper, we present a HE and GC hybrid gated recurrent unit (GRU) network, CryptoGRU, for low-latency secure inferences. CryptoGRU replaces computationally expensive GC-based $tanh$ with fast GC-based $ReLU$, and then quantizes $sigmoid$ and $ReLU$ with a smaller bit length to accelerate activations in a GRU. We evaluate CryptoGRU with multiple GRU models trained on 4 public datasets. Experimental results show CryptoGRU achieves top-notch accuracy and improves the secure inference latency by up to $138\times$ over one of state-of-the-art secure networks on the Penn Treebank dataset.
Submitted 9 September, 2021; v1 submitted 22 October, 2020;
originally announced October 2020.
-
AICov: An Integrative Deep Learning Framework for COVID-19 Forecasting with Population Covariates
Authors:
Geoffrey C. Fox,
Gregor von Laszewski,
Fugang Wang,
Saumyadipta Pyne
Abstract:
The COVID-19 pandemic has profound global consequences on health, economic, social, political, and almost every major aspect of human life. Therefore, it is of great importance to model COVID-19 and other pandemics in terms of the broader social contexts in which they take place. We present the architecture of AICov, which provides an integrative deep learning framework for COVID-19 forecasting with population covariates, some of which may serve as putative risk factors. We have integrated multiple different strategies into AICov, including the ability to use deep learning strategies based on LSTM and even modeling. To demonstrate our approach, we have conducted a pilot that integrates population covariates from multiple sources. Thus, AICov not only includes data on COVID-19 cases and deaths but, more importantly, the population's socioeconomic, health and behavioral risk factors at a local level. The compiled data are fed into AICov, and thus we obtain improved prediction by integration of the data to our model as compared to one that only uses case and death data.
Submitted 8 October, 2020;
originally announced October 2020.
-
Solving Newton's Equations of Motion with Large Timesteps using Recurrent Neural Networks based Operators
Authors:
JCS Kadupitiya,
Geoffrey C. Fox,
Vikram Jadhao
Abstract:
Classical molecular dynamics simulations are based on solving Newton's equations of motion. Using a small timestep, numerical integrators such as Verlet generate trajectories of particles as solutions to Newton's equations. We introduce operators derived using recurrent neural networks that accurately solve Newton's equations utilizing sequences of past trajectory data, and produce energy-conserving dynamics of particles using timesteps up to 4000 times larger compared to the Verlet timestep. We demonstrate significant speedup in many example problems including 3D systems of up to 16 particles.
Submitted 13 December, 2021; v1 submitted 12 April, 2020;
originally announced April 2020.
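For context on the baseline named in the abstract above, a small-timestep Verlet-type integrator can be sketched as follows. This is a generic velocity Verlet scheme for a single particle in one dimension, shown only as an illustration of the conventional integrator; it is not the recurrent-neural-network operator the paper introduces, and the function name and parameters are hypothetical.

```python
import math
from typing import Callable, Tuple


def velocity_verlet(
    x: float,
    v: float,
    force: Callable[[float], float],
    mass: float,
    dt: float,
    steps: int,
) -> Tuple[float, float]:
    """Integrate Newton's equation m * x'' = force(x) with velocity Verlet.

    Hypothetical 1D helper: returns the position and velocity after
    `steps` timesteps of size `dt`.
    """
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v


# Example: harmonic oscillator with unit mass and spring constant (assumed values).
x_end, v_end = velocity_verlet(
    x=1.0, v=0.0, force=lambda x: -x, mass=1.0, dt=0.01, steps=628
)
energy = 0.5 * v_end**2 + 0.5 * x_end**2  # stays close to the initial value 0.5
```

The timestep must remain small relative to the fastest motion in the system for the trajectory to stay accurate, which is the constraint the abstract's large-timestep operators aim to relax.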
-
Glyph: Fast and Accurately Training Deep Neural Networks on Encrypted Data
Authors:
Qian Lou,
Bo Feng,
Geoffrey C. Fox,
Lei Jiang
Abstract:
Big data is one of the cornerstones of enabling and training deep neural networks (DNNs). Lacking the expertise to benefit from their data, average users have to rely on big data companies they may not trust and upload their private data to them. Due to compliance, legal, or privacy constraints, most users are willing to contribute only their encrypted data, and lack the interest or resources to join the training of DNNs in the cloud. To train a DNN on encrypted data in a completely non-interactive way, a recent work proposes a fully homomorphic encryption (FHE)-based technique implementing all activations in the neural network by Brakerski-Gentry-Vaikuntanathan (BGV)-based lookup tables. However, such inefficient lookup-table-based activations significantly prolong the training latency of privacy-preserving DNNs.
In this paper, we propose Glyph, an FHE-based scheme to train DNNs quickly and accurately on encrypted data by switching between TFHE (Fast Fully Homomorphic Encryption over the Torus) and BGV cryptosystems. Glyph uses logic-operation-friendly TFHE to implement nonlinear activations, while adopting vectorial-arithmetic-friendly BGV to perform multiply-accumulation (MAC) operations. Glyph further applies transfer learning to the training of DNNs to improve the test accuracy and reduce the number of MAC operations between ciphertext and ciphertext in convolutional layers. Our experimental results show that Glyph obtains state-of-the-art test accuracy while reducing the training latency by $99\%$ over the prior FHE-based technique on various encrypted datasets.
Submitted 21 October, 2020; v1 submitted 16 November, 2019;
originally announced November 2019.
-
Parallel Performance of Molecular Dynamics Trajectory Analysis
Authors:
Mahzad Khoshlessan,
Ioannis Paraskevakos,
Geoffrey C. Fox,
Shantenu Jha,
Oliver Beckstein
Abstract:
The performance of biomolecular molecular dynamics simulations has steadily increased on modern high performance computing resources but acceleration of the analysis of the output trajectories has lagged behind so that analyzing simulations is becoming a bottleneck. To close this gap, we studied the performance of parallel trajectory analysis with MPI and the Python MDAnalysis library on three different XSEDE supercomputers where trajectories were read from a Lustre parallel file system. Strong scaling performance was impeded by stragglers, MPI processes that were slower than the typical process. Stragglers were less prevalent for compute-bound workloads, thus pointing to file reading as a bottleneck for scaling. However, a more complicated picture emerged in which both the computation and the data ingestion exhibited close to ideal strong scaling behavior whereas stragglers were primarily caused by either large MPI communication costs or long times to open the single shared trajectory file. We improved overall strong scaling performance by either subfiling (splitting the trajectory into separate files) or MPI-IO with Parallel HDF5 trajectory files. The parallel HDF5 approach resulted in near ideal strong scaling on up to 384 cores (16 nodes), thus reducing trajectory analysis times by two orders of magnitude compared to the serial approach.
Submitted 27 March, 2020; v1 submitted 28 June, 2019;
originally announced July 2019.
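The strong scaling behavior discussed in the abstract above is usually summarized by two quantities: speedup, $S(p) = T_1 / T_p$, and parallel efficiency, $E(p) = S(p) / p$, for a fixed problem size run on $p$ processes. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the timings in the usage line are made-up placeholders, not results from the paper.

```python
from typing import Dict, Tuple


def strong_scaling(
    serial_time: float, parallel_times: Dict[int, float]
) -> Dict[int, Tuple[float, float]]:
    """Map each process count p to (speedup, efficiency).

    speedup    = serial_time / T_p
    efficiency = speedup / p   (1.0 means ideal strong scaling)
    """
    result = {}
    for p, t_p in sorted(parallel_times.items()):
        speedup = serial_time / t_p
        result[p] = (speedup, speedup / p)
    return result


# Placeholder wall-clock times in seconds (illustrative only).
scaling = strong_scaling(100.0, {1: 100.0, 2: 50.0, 4: 40.0})
```

An efficiency well below 1.0 at higher process counts, as with the four-process placeholder here, is the signature of the kind of bottleneck (for example stragglers or shared-file access) that the abstract describes.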
-
Task-parallel Analysis of Molecular Dynamics Trajectories
Authors:
Ioannis Paraskevakos,
Andre Luckow,
Mahzad Khoshlessan,
George Chantzialexiou,
Thomas E. Cheatham,
Oliver Beckstein,
Geoffrey C. Fox,
Shantenu Jha
Abstract:
Different parallel frameworks for implementing data analysis applications have been proposed by the HPC and Big Data communities. In this paper, we investigate three task-parallel frameworks: Spark, Dask and RADICAL-Pilot with respect to their ability to support data analytics on HPC resources and compare them with MPI. We investigate the data analysis requirements of Molecular Dynamics (MD) simulations which are significant consumers of supercomputing cycles, producing immense amounts of data. A typical large-scale MD simulation of a physical system of O(100k) atoms over μsecs can produce from O(10) GB to O(1000) GBs of data. We propose and evaluate different approaches for parallelization of a representative set of MD trajectory analysis algorithms, in particular the computation of path similarity and leaflet identification. We evaluate Spark, Dask and RADICAL-Pilot with respect to their abstractions and runtime engine capabilities to support these algorithms. We provide a conceptual basis for comparing and understanding different frameworks that enable users to select the optimal system for each application. We also provide a quantitative performance analysis of the different algorithms across the three frameworks.
Submitted 10 June, 2018; v1 submitted 23 January, 2018;
originally announced January 2018.
-
Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction
Authors:
Mingze Xu,
Chenyou Fan,
John D Paden,
Geoffrey C Fox,
David J Crandall
Abstract:
Deep learning methods have surpassed the performance of traditional techniques on a wide range of problems in computer vision, but nearly all of this work has studied consumer photos, where precisely correct output is often not critical. It is less clear how well these techniques may apply on structured prediction problems where fine-grained output with high precision is required, such as in scientific imaging domains. Here we consider the problem of segmenting echogram radar data collected from the polar ice sheets, which is challenging because segmentation boundaries are often very weak and there is a high degree of noise. We propose a multi-task spatiotemporal neural network that combines 3D ConvNets and Recurrent Neural Networks (RNNs) to estimate ice surface boundaries from sequences of tomographic radar images. We show that our model outperforms the state-of-the-art on this problem by (1) avoiding the need for hand-tuned parameters, (2) extracting multiple surfaces (ice-air and ice-bed) simultaneously, (3) requiring less non-visual metadata, and (4) being about 6 times faster.
Submitted 20 July, 2018; v1 submitted 11 January, 2018;
originally announced January 2018.
-
Automatic Estimation of Ice Bottom Surfaces from Radar Imagery
Authors:
Mingze Xu,
David J Crandall,
Geoffrey C Fox,
John D Paden
Abstract:
Ground-penetrating radar on planes and satellites now makes it practical to collect 3D observations of the subsurface structure of the polar ice sheets, providing crucial data for understanding and tracking global climate change. But converting these noisy readings into useful observations is generally done by hand, which is impractical at a continental scale. In this paper, we propose a computer vision-based technique for extracting 3D ice-bottom surfaces by viewing the task as an inference problem on a probabilistic graphical model. We first generate a seed surface subject to a set of constraints, and then incorporate additional sources of evidence to refine it via discrete energy minimization. We evaluate the performance of the tracking algorithm on 7 topographic sequences (each with over 3000 radar images) collected from the Canadian Arctic Archipelago with respect to human-labeled ground truth.
Submitted 20 December, 2017;
originally announced December 2017.
-
Status of Serverless Computing and Function-as-a-Service(FaaS) in Industry and Research
Authors:
Geoffrey C. Fox,
Vatche Ishakian,
Vinod Muthusamy,
Aleksander Slominski
Abstract:
This whitepaper summarizes issues raised during the First International Workshop on Serverless Computing (WoSC) 2017 held June 5th 2017 and especially in the panel and associated discussion that concluded the workshop. We also include comments from the keynote and submitted papers. A glossary at the end (section 8) defines many technical terms used in this report.
Submitted 26 August, 2017;
originally announced August 2017.
-
A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures
Authors:
Shantenu Jha,
Judy Qiu,
Andre Luckow,
Pradeep Mantha,
Geoffrey C. Fox
Abstract:
Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large volumes of data. We analyze the ecosystems of the two prominent paradigms for data-intensive applications, hereafter referred to as the high-performance computing and the Apache-Hadoop paradigm. We propose a basis, common terminology and functional factors upon which to analyze the two approaches of both paradigms. We discuss the concept of "Big Data Ogres" and their facets as means of understanding and characterizing the most common application workloads found across the two paradigms. We then discuss the salient features of the two paradigms, and compare and contrast the two approaches. Specifically, we examine common implementation/approaches of these paradigms, shed light upon the reasons for their current "architecture" and discuss some typical workloads that utilize them. In spite of the significant software distinctions, we believe there is architectural similarity. We discuss the potential integration of different implementations, across the different levels and components. Our comparison progresses from a fully qualitative examination of the two paradigms, to a semi-quantitative methodology. We use a simple and broadly used Ogre (K-means clustering), characterize its performance on a range of representative platforms, covering several implementations from both paradigms. Our experiments provide an insight into the relative strengths of the two paradigms. We propose that the set of Ogres will serve as a benchmark to evaluate the two paradigms along different dimensions.
Submitted 22 June, 2014; v1 submitted 6 March, 2014;
originally announced March 2014.