Comparing Workflow Application Designs for High Resolution Satellite Image Analysis
Authors:
Aymen Al-Saadi,
Ioannis Paraskevakos,
Bento Collares Gonçalves,
Heather J. Lynch,
Shantenu Jha,
Matteo Turilli
Abstract:
Very High Resolution satellite and aerial imagery are used to monitor and conduct large scale surveys of ecological systems. Convolutional Neural Networks have successfully been employed to analyze such imagery to detect large animals and salient features. As the datasets increase in volume and number of images, utilizing High Performance Computing resources becomes necessary. In this paper, we investigate three task-parallel, data-driven workflow designs to support imagery analysis pipelines with heterogeneous tasks on HPC. We analyze the capabilities of each design when processing datasets from two use cases for a total of 4,672 satellite and aerial images, and 8.35 TB of data. We experimentally model the execution time of the tasks of the image processing pipelines. We perform experiments to characterize the resource utilization, total time to completion, and overheads of each design. Based on the model, overhead and utilization analysis, we show which design is best suited to scientific pipelines with similar characteristics.
Submitted 27 October, 2020;
originally announced October 2020.
Workflow Design Analysis for High Resolution Satellite Image Analysis
Authors:
Ioannis Paraskevakos,
Matteo Turilli,
Bento Collares Gonçalves,
Heather J. Lynch,
Shantenu Jha
Abstract:
Ecological sciences are using imagery from a variety of sources to monitor and survey populations and ecosystems. Very High Resolution (VHR) satellite imagery provides an effective dataset for large scale surveys. Convolutional Neural Networks have successfully been employed to analyze such imagery and detect large animals. As the datasets increase in volume, O(TB), and number of images, O(1k), utilizing High Performance Computing (HPC) resources becomes necessary. In this paper, we investigate task-parallel, data-driven workflow designs to support imagery analysis pipelines with heterogeneous tasks on HPC. We analyze the capabilities of each design when processing a dataset of 3,000 VHR satellite images for a total of 4 TB. We experimentally model the execution time of the tasks of the image processing pipeline. We perform experiments to characterize the resource utilization, total time to completion, and overheads of each design. Based on the model, overhead and utilization analysis, we show which design approach is best suited to scientific pipelines with similar characteristics.
Submitted 29 January, 2020; v1 submitted 23 May, 2019;
originally announced May 2019.