Search | arXiv e-print repository

Foundation Models for Generalist Geospatial Artificial Intelligence

Authors: Johannes Jakubik, Sujit Roy, C. E. Phillips, Paolo Fraccaro, Denys Godwin, Bianca Zadrozny, Daniela Szwarcman, Carlos Gomes, Gabby Nyirjesy, Blair Edwards, Daiki Kimura, Naomi Simumba, Linsong Chu, S. Karthik Mukkavilli, Devyani Lambhate, Kamal Das, Ranjini Bangalore, Dario Oliveira, Michal Muszynski, Kumar Ankur, Muthukumaran Ramasubramanian, Iksha Gurung, Sam Khallaghi, Hanxi Li, et al. (8 additional authors not shown)

Abstract: Significant progress in the development of highly adaptable and reusable Artificial Intelligence (AI) models is expected to have a significant impact on Earth science and remote sensing. Foundation models are pre-trained on large unlabeled datasets through self-supervision, and then fine-tuned for various downstream tasks with small labeled datasets. This paper introduces a first-of-a-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive geospatial data. We have utilized this framework to create Prithvi, a transformer-based geospatial foundational model pre-trained on more than 1TB of multispectral satellite imagery from the Harmonized Landsat-Sentinel 2 (HLS) dataset. Our study demonstrates the efficacy of our framework in successfully fine-tuning Prithvi to a range of Earth observation tasks that have not been tackled by previous work on foundation models involving multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation, and multi-temporal crop segmentation. Our experiments show that the pre-trained model accelerates the fine-tuning process compared to leveraging randomly initialized weights. In addition, pre-trained Prithvi compares well against the state-of-the-art, e.g., outperforming a conditional GAN model in multi-temporal cloud imputation by up to 5pp (or 5.7%) in the structural similarity index.
Finally, due to the limited availability of labeled data in the field of Earth observation, we gradually reduce the quantity of available labeled data for refining the model to evaluate data efficiency and demonstrate that data can be decreased significantly without affecting the model's accuracy. The pre-trained 100 million parameter model and corresponding fine-tuning workflows have been released publicly as open source contributions to the global Earth sciences community through Hugging Face.

Submitted 8 November, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

arXiv:2309.03229 [pdf, other]

Which algorithm to select in sports timetabling?

Authors: David Van Bulck, Dries Goossens, Jan-Patrick Clarner, Angelos Dimitsas, George H. G. Fonseca, Carlos Lamas-Fernandez, Martin Mariusz Lester, Jaap Pedersen, Antony E. Phillips, Roberto Maria Rosati

Abstract: Any sports competition needs a timetable, specifying when and where teams meet each other. The recent International Timetabling Competition (ITC2021) on sports timetabling showed that, although it is possible to develop general algorithms, the performance of each algorithm varies considerably over the problem instances. This paper provides an instance space analysis for sports timetabling, resulting in powerful insights into the strengths and weaknesses of eight state-of-the-art algorithms. Based on machine learning techniques, we propose an algorithm selection system that predicts which algorithm is likely to perform best when given the characteristics of a sports timetabling problem instance. Furthermore, we identify which characteristics are important in making that prediction, providing insights in the performance of the algorithms, and suggestions to further improve them. Finally, we assess the empirical hardness of the instances. Our results are based on large computational experiments involving about 50 years of CPU time on more than 500 newly generated problem instances.

Submitted 4 September, 2023; originally announced September 2023.

Comments: This is a non-peer-reviewed working paper

arXiv:2110.04729 [pdf, other]

Humans' Assessment of Robots as Moral Regulators: Importance of Perceived Fairness and Legitimacy

Authors: Boyoung Kim, Elizabeth Phillips

Abstract: Previous research has shown that the fairness and the legitimacy of a moral decision-maker are important for people's acceptance of and compliance with the decision-maker. As technology rapidly advances, there have been increasing hopes and concerns about building artificially intelligent entities that are designed to intervene against norm violations. However, it is unclear how people would perceive artificial moral regulators that impose punishment on human wrongdoers. Grounded in theories of psychology and law, we predict that the perceived fairness of punishment imposed by a robot would increase the legitimacy of the robot functioning as a moral regulator, which would, in turn, increase people's willingness to accept and comply with the robot's decisions. We close with a conceptual framework for building a robot moral regulator that can successfully regulate norm violations.

Submitted 7 October, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

Comments: Presented at AI-HRI symposium as part of AAAI-FSS 2021 (arXiv:2109.10836)

Report number: AIHRI/2021/52

arXiv:2110.03071 [pdf, other]

Two Many Cooks: Understanding Dynamic Human-Agent Team Communication and Perception Using Overcooked 2

Authors: Andres Rosero, Faustina Dinh, Ewart J. de Visser, Tyler Shaw, Elizabeth Phillips

Abstract: This paper describes a research study that aims to investigate changes in effective communication during human-AI collaboration with special attention to the perception of competence among team members and varying levels of task load placed on the team. We will also investigate differences between human-human teamwork and human-agent teamwork. Our project will measure differences in the communication quality, team perception and performance of a human actor playing a commercial off-the-shelf (COTS) game with either a human teammate or a simulated AI teammate under varying task load. We argue that the increased cognitive workload associated with increased task load will be negatively associated with team performance and have a negative impact on communication quality. In addition, we argue that positive team perceptions will have a positive impact on the communication quality between a user and teammate in both the human and AI teammate conditions. This project will offer more refined insights on human-AI relationship dynamics in collaborative tasks by considering communication quality, team perception, and performance under increasing cognitive workload.

Submitted 6 October, 2021; originally announced October 2021.

Comments: Presented at AI-HRI symposium as part of AAAI-FSS 2021 (arXiv:2109.10836)

Report number: AIHRI/2021/28

arXiv:2001.05234 [pdf, other]

doi 10.1016/j.camwa.2020.01.002

GPU acceleration of CaNS for massively-parallel direct numerical simulations of canonical fluid flows

Authors: Pedro Costa, Everett Phillips, Luca Brandt, Massimiliano Fatica

Abstract: This work presents the GPU acceleration of the open-source code CaNS for very fast massively-parallel simulations of canonical fluid flows. The distinct feature of the many-CPU Navier-Stokes solver in CaNS is its fast direct solver for the second-order finite-difference Poisson equation, based on the method of eigenfunction expansions. The solver implements all the boundary conditions valid for this type of problem in a unified framework. Here, we extend the solver for GPU-accelerated clusters using CUDA Fortran. The porting makes extensive use of CUF kernels and has been greatly simplified by the unified memory feature of CUDA Fortran, which handles the data migration between host (CPU) and device (GPU) without defining new arrays in the source code. The overall implementation has been validated against benchmark data for turbulent channel flow and its performance assessed on an NVIDIA DGX-2 system (16 Tesla V100 32 GB, connected with NVLink via NVSwitch). The wall-clock time per time step of the GPU-accelerated implementation is impressively small when compared to its CPU implementation on state-of-the-art many-CPU clusters, as long as the domain partitioning is sufficiently small that the data resides mostly on the GPUs. The implementation has been made freely available and open-source under the terms of an MIT license.

Submitted 2 October, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Journal ref: Computers & Mathematics with Applications 81 (2021) 502-511

arXiv:1908.09470 [pdf, other]

Local Graph Stability in Exponential Family Random Graph Models

Authors: Yue Yu, Gianmarc Grazioli, Nolan E. Phillips, Carter T. Butts

Abstract: Exponential family Random Graph Models (ERGMs) can be viewed as expressing a probability distribution on graphs arising from the action of competing social forces that make ties more or less likely, depending on the state of the rest of the graph. Such forces often lead to a complex pattern of dependence among edges, with non-trivial large-scale structures emerging from relatively simple local mechanisms. While this provides a powerful tool for probing macro-micro connections, much remains to be understood about how local forces shape global outcomes. One simple question of this type is that of the conditions needed for social forces to stabilize a particular structure. We refer to this property as local stability and seek a general means of identifying the set of parameters under which a target graph is locally stable with respect to a set of alternatives. Here, we provide a complete characterization of the region of the parameter space inducing local stability, showing it to be the interior of a convex cone whose faces can be derived from the change-scores of the sufficient statistics vis-a-vis the alternative structures. As we show, local stability is a necessary but not sufficient condition for more general notions of stability, the latter of which can be explored more efficiently by using the "stable cone" within the parameter space as a starting point.
In addition, we show how local stability can be used to determine whether a fitted model implies that an observed structure would be expected to arise primarily from the action of social forces, versus by merit of the model permitting a large number of high probability structures, of which the observed structure is one. We also use our approach to identify the dyads within a given structure that are the least stable, and hence predicted to have the highest probability of changing over time.

Submitted 26 August, 2019; originally announced August 2019.

arXiv:1810.01993 [pdf, other]

Exascale Deep Learning for Climate Analytics

Authors: Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston

Abstract: We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parallel efficiency of 79.0%. DeepLabv3+ scales up to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak and sustained throughput of 1.13 EF/s and 999.0 PF/s respectively.

Submitted 3 October, 2018; originally announced October 2018.

Comments: 12 pages, 5 tables, 4 figures, Super Computing Conference November 11-16, 2018, Dallas, TX, USA

arXiv:1708.03655 [pdf, other]

Communicating Robot Arm Motion Intent Through Mixed Reality Head-mounted Displays

Authors: Eric Rosen, David Whitney, Elizabeth Phillips, Gary Chien, James Tompkin, George Konidaris, Stefanie Tellex

Abstract: Efficient motion intent communication is necessary for safe and collaborative work environments with collocated humans and robots. Humans efficiently communicate their motion intent to other humans through gestures, gaze, and social cues. However, robots often have difficulty efficiently communicating their motion intent to humans via these methods. Many existing methods for robot motion intent communication rely on 2D displays, which require the human to continually pause their work and check a visualization. We propose a mixed reality head-mounted display visualization of the proposed robot motion over the wearer's real-world view of the robot and its environment. To evaluate the effectiveness of this system against a 2D display visualization and against no visualization, we asked 32 participants to label different robot arm motions as either colliding or non-colliding with blocks on a table. We found a 16% increase in accuracy with a 62% decrease in the time it took to complete the task compared to the next best system. This demonstrates that a mixed-reality HMD allows a human to more quickly and accurately tell where the robot is going to move than the compared baselines.

Submitted 11 August, 2017; originally announced August 2017.

Showing 1–8 of 8 results for author: Phillips, E