-
Examining the Effect of Implementation Factors on Deep Learning Reproducibility
Authors:
Kevin Coakley,
Christine R. Kirkpatrick,
Odd Erik Gundersen
Abstract:
Reproducing published deep learning papers to validate their conclusions can be difficult due to sources of irreproducibility. We investigate the impact that implementation factors have on the results and how they affect reproducibility of deep learning studies. Three deep learning experiments were ran five times each on 13 different hardware environments and four different software environments.…
▽ More
Reproducing published deep learning papers to validate their conclusions can be difficult due to sources of irreproducibility. We investigate the impact that implementation factors have on the results and how they affect reproducibility of deep learning studies. Three deep learning experiments were ran five times each on 13 different hardware environments and four different software environments. The analysis of the 780 combined results showed that there was a greater than 6% accuracy range on the same deterministic examples introduced from hardware or software environment variations alone. To account for these implementation factors, researchers should run their experiments multiple times in different hardware and software environments to verify their conclusions are not affected.
△ Less
Submitted 11 December, 2023;
originally announced December 2023.
-
Engaging with Researchers and Raising Awareness of FAIR and Open Science through the FAIR+ Implementation Survey Tool (FAIRIST)
Authors:
Christine R. Kirkpatrick,
Kevin L. Coakley,
Julie Christopher,
Ines Dutra
Abstract:
Six years after the seminal paper on FAIR was published, researchers still struggle to understand how to implement FAIR. For many researchers FAIR promises long-term benefits for near-term effort, requires skills not yet acquired, and is one more thing in a long list of unfunded mandates and onerous requirements on scientists. Even for those required to or who are convinced they must make time for…
▽ More
Six years after the seminal paper on FAIR was published, researchers still struggle to understand how to implement FAIR. For many researchers FAIR promises long-term benefits for near-term effort, requires skills not yet acquired, and is one more thing in a long list of unfunded mandates and onerous requirements on scientists. Even for those required to or who are convinced they must make time for FAIR research practices, the preference is for just-in-time advice properly sized to the scientific artifacts and process. Because of the generality of most FAIR implementation guidance, it is difficult for a researcher to adjust the advice to their situation. Technological advances, especially in the area of artificial intelligence (AI) and machine learning (ML), complicate FAIR adoption as researchers and data stewards ponder how to make software, workflows, and models FAIR and reproducible. The FAIR+ Implementation Survey Tool (FAIRIST) mitigates the problem by integrating research requirements with research proposals in a systematic way. FAIRIST factors in new scholarly outputs such as nanopublications and notebooks, and the various research artifacts related to AI research (data, models, workflows, and benchmarks). Researchers step through a self-serve survey process and receive a table ready for use in their DMP and/or work plan while gaining awareness of the FAIR Principles and Open Science concepts. FAIRIST is a model that uses part of the proposal process as a way to do outreach, raise awareness of FAIR dimensions and considerations, while providing just-in-time assistance for competitive proposals.
△ Less
Submitted 17 January, 2023;
originally announced January 2023.
-
FAIR for AI: An interdisciplinary and international community building perspective
Authors:
E. A. Huerta,
Ben Blaiszik,
L. Catherine Brinson,
Kristofer E. Bouchard,
Daniel Diaz,
Caterina Doglioni,
Javier M. Duarte,
Murali Emani,
Ian Foster,
Geoffrey Fox,
Philip Harris,
Lukas Heinrich,
Shantenu Jha,
Daniel S. Katz,
Volodymyr Kindratenko,
Christine R. Kirkpatrick,
Kati Lassila-Perini,
Ravi K. Madduri,
Mark S. Neubauer,
Fotis E. Psomopoulos,
Avik Roy,
Oliver Rübel,
Zhizhen Zhao,
Ruike Zhu
Abstract:
A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to i…
▽ More
A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to include the software, tools, algorithms, and workflows that produce data. FAIR principles are now being adapted in the context of AI models and datasets. Here, we present the perspectives, vision, and experiences of researchers from different countries, disciplines, and backgrounds who are leading the definition and adoption of FAIR principles in their communities of practice, and discuss outcomes that may result from pursuing and incentivizing FAIR AI research. The material for this report builds on the FAIR for AI Workshop held at Argonne National Laboratory on June 7, 2022.
△ Less
Submitted 1 August, 2023; v1 submitted 30 September, 2022;
originally announced October 2022.
-
DataPerf: Benchmarks for Data-Centric AI Development
Authors:
Mark Mazumder,
Colby Banbury,
Xiaozhe Yao,
Bojan Karlaš,
William Gaviria Rojas,
Sudnya Diamos,
Greg Diamos,
Lynn He,
Alicia Parrish,
Hannah Rose Kirk,
Jessica Quaye,
Charvi Rastogi,
Douwe Kiela,
David Jurado,
David Kanter,
Rafael Mosquera,
Juan Ciro,
Lora Aroyo,
Bilge Acun,
Lingjiao Chen,
Mehul Smriti Raje,
Max Bartolo,
Sabri Eyuboglu,
Amirata Ghorbani,
Emmett Goodman
, et al. (20 additional authors not shown)
Abstract:
Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing datase…
▽ More
Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing dataset benchmarks. In response, we present DataPerf, a community-led benchmark suite for evaluating ML datasets and data-centric algorithms. We aim to foster innovation in data-centric AI through competition, comparability, and reproducibility. We enable the ML community to iterate on datasets, instead of just architectures, and we provide an open, online platform with multiple rounds of challenges to support this iterative development. The first iteration of DataPerf contains five benchmarks covering a wide spectrum of data-centric techniques, tasks, and modalities in vision, speech, acquisition, debugging, and diffusion prompting, and we support hosting new contributed benchmarks from the community. The benchmarks, online evaluation platform, and baseline implementations are open source, and the MLCommons Association will maintain DataPerf to ensure long-term benefits to academia and industry.
△ Less
Submitted 13 October, 2023; v1 submitted 20 July, 2022;
originally announced July 2022.
-
Bioinformatics Computational Cluster Batch Task Profiling with Machine Learning for Failure Prediction
Authors:
Christopher Harrison,
Christine R. Kirkpatrick,
Inês Dutra
Abstract:
Motivation: Traditional computational cluster schedulers are based on user inputs and run time needs request for memory and CPU, not IO. Heavily IO bound task run times, like ones seen in many big data and bioinformatics problems, are dependent on the IO subsystems scheduling and are problematic for cluster resource scheduling. The problematic rescheduling of IO intensive and errant tasks is a los…
▽ More
Motivation: Traditional computational cluster schedulers are based on user inputs and run time needs request for memory and CPU, not IO. Heavily IO bound task run times, like ones seen in many big data and bioinformatics problems, are dependent on the IO subsystems scheduling and are problematic for cluster resource scheduling. The problematic rescheduling of IO intensive and errant tasks is a lost resource. Understanding the conditions in both successful and failed tasks and differentiating them could provide knowledge to enhancing cluster scheduling and intelligent resource optimization.
Results: We analyze a production computational cluster contributing 6.7 thousand CPU hours to research over two years. Through this analysis we develop a machine learning task profiling agent for clusters that attempts to predict failures between identically provision requested tasks.
△ Less
Submitted 22 December, 2018;
originally announced December 2018.
-
Measuring Economic Resilience to Natural Disasters with Big Economic Transaction Data
Authors:
Elena Alfaro Martinez,
Maria Hernandez Rubio,
Roberto Maestre Martinez,
Juan Murillo Arias,
Dario Patane,
Amanda Zerbe,
Robert Kirkpatrick,
Miguel Luengo-Oroz,
Amanda Zerbe
Abstract:
This research explores the potential to analyze bank card payments and ATM cash withdrawals in order to map and quantify how people are impacted by and recover from natural disasters. Our approach defines a disaster-affected community's economic recovery time as the time needed to return to baseline activity levels in terms of number of bank card payments and ATM cash withdrawals. For Hurricane Od…
▽ More
This research explores the potential to analyze bank card payments and ATM cash withdrawals in order to map and quantify how people are impacted by and recover from natural disasters. Our approach defines a disaster-affected community's economic recovery time as the time needed to return to baseline activity levels in terms of number of bank card payments and ATM cash withdrawals. For Hurricane Odile, which hit the state of Baja California Sur (BCS) in Mexico between 15 and 17 September 2014, we measured and mapped communities' economic recovery time, which ranged from 2 to 40 days in different locations. We found that -- among individuals with a bank account -- the lower the income level, the shorter the time needed for economic activity to return to normal levels. Gender differences in recovery times were also detected and quantified. In addition, our approach evaluated how communities prepared for the disaster by quantifying expenditure growth in food or gasoline before the hurricane struck. We believe this approach opens a new frontier in measuring the economic impact of disasters with high temporal and spatial resolution, and in understanding how populations bounce back and adapt.
△ Less
Submitted 27 September, 2016;
originally announced September 2016.