Showing 1–1 of 1 results for author: Niemann, J

Search v0.5.6 released 2020-02-24

arXiv:2303.01998

cs.SE cs.AI

MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

Authors: Katherine R. Maffey, Kyle Dotterrer, Jennifer Niemann, Iain Cruickshank, Grace A. Lewis, Christian Kästner

Abstract: Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.

Submitted 3 March, 2023; originally announced March 2023.

Comments: Accepted to the NIER Track of the 45th International Conference on Software Engineering (ICSE 2023)
