A Topological-Framework to Improve Analysis of Machine Learning Model Performance
Authors:
Henry Kvinge,
Colby Wight,
Sarah Akers,
Scott Howland,
Woongjo Choi,
Xiaolong Ma,
Luke Gosink,
Elizabeth Jurrus,
Keerti Kappagantula,
Tegan H. Emerson
Abstract:
As both machine learning models and the datasets on which they are evaluated have grown in size and complexity, the practice of using a few summary statistics to understand model performance has become increasingly problematic. This is particularly true in real-world scenarios where understanding model failure on certain subpopulations of the data is of critical importance. In this paper we propose a topological framework for evaluating machine learning models in which a dataset is treated as a "space" on which a model operates. This provides us with a principled way to organize information about model performance at both the global level (over the entire test set) and the local level (on specific subpopulations). Finally, we describe a topological data structure, the presheaf, which offers a convenient way to store and analyze model performance across different subpopulations.
Submitted 9 July, 2021;
originally announced July 2021.