-
Machine Teaching: A New Paradigm for Building Machine Learning Systems
Authors:
Patrice Y. Simard,
Saleema Amershi,
David M. Chickering,
Alicia Edelman Pelton,
Soroush Ghorashi,
Christopher Meek,
Gonzalo Ramos,
Jina Suh,
Johan Verwey,
Mo Wang,
John Wernsing
Abstract:
The current processes for building machine learning systems require practitioners with deep knowledge of machine learning. This significantly limits the number of machine learning systems that can be created and has led to a mismatch between the demand for machine learning systems and the ability for organizations to build them. We believe that in order to meet this growing demand for machine learning systems we must significantly increase the number of individuals that can teach machines. We postulate that we can achieve this goal by making the process of teaching machines easy, fast and above all, universally accessible.
While machine learning focuses on creating new algorithms and improving the accuracy of "learners", the machine teaching discipline focuses on the efficacy of the "teachers". Machine teaching as a discipline is a paradigm shift that follows and extends principles of software engineering and programming languages. We put a strong emphasis on the teacher and the teacher's interaction with data, as well as crucial components such as techniques and design principles of interaction and visualization.
In this paper, we present our position regarding the discipline of machine teaching and articulate fundamental machine teaching principles. We also describe how, by decoupling knowledge about machine learning algorithms from the process of teaching, we can accelerate innovation and empower millions of new uses for machine learning models.
Submitted 10 August, 2017; v1 submitted 20 July, 2017;
originally announced July 2017.
-
Analysis of a Design Pattern for Teaching with Features and Labels
Authors:
Christopher Meek,
Patrice Simard,
Xiaojin Zhu
Abstract:
We study the task of teaching a machine to classify objects using features and labels. We introduce the Error-Driven-Featuring design pattern for teaching using features and labels in which a teacher prefers to introduce features only if they are needed. We analyze the potential risks and benefits of this teaching pattern through the use of teaching protocols, illustrative examples, and by providing bounds on the effort required for an optimal machine teacher using a linear learning algorithm, the most commonly used type of learners in interactive machine learning systems. Our analysis provides a deeper understanding of potential trade-offs of using different learning algorithms and between the effort required for featuring (creating new features) and labeling (providing labels for objects).
Submitted 17 November, 2016;
originally announced November 2016.
-
Interactive Semantic Featuring for Text Classification
Authors:
Camille Jandot,
Patrice Simard,
Max Chickering,
David Grangier,
Jina Suh
Abstract:
In text classification, dictionaries can be used to define human-comprehensible features. We propose an improvement to dictionary features called smoothed dictionary features. These features recognize document contexts instead of n-grams. We describe a principled methodology to solicit dictionary features from a teacher, and present results showing that models built using these human-comprehensible features are competitive with models trained with Bag of Words features.
Submitted 23 June, 2016;
originally announced June 2016.
-
ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Authors:
Patrice Simard,
David Chickering,
Aparna Lakshmiratan,
Denis Charles,
Leon Bottou,
Carlos Garcia Jurado Suarez,
David Grangier,
Saleema Amershi,
Johan Verwey,
Jina Suh
Abstract:
Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The learning machine leverages big data to find examples that maximize the training value of its interaction with the teacher. When the teacher is restricted to labeling examples selected by the machine, this problem is an instance of active learning. When the teacher can provide additional information to the machine (e.g., suggestions on what examples or predictive features should be used) as the learning task progresses, then the problem becomes one of interactive learning.
To accommodate the two-way communication channel needed for efficient interactive learning, the teacher and the machine need an environment that supports an interaction language. The machine can access, process, and summarize more examples than the teacher can see in a lifetime. Based on the machine's output, the teacher can revise the definition of the task or make it more precise. Both the teacher and the machine continuously learn and benefit from the interaction.
We have built a platform to (1) produce valuable and deployable models and (2) support research on both the machine learning and user interface challenges of the interactive learning problem. The platform relies on a dedicated, low-latency, distributed, in-memory architecture that allows us to construct web-scale learning machines with quick interaction speed. The purpose of this paper is to describe this architecture and demonstrate how it supports our research efforts. Preliminary results are presented as illustrations of the architecture but are not the primary focus of the paper.
Submitted 16 September, 2014;
originally announced September 2014.
-
Counterfactual Reasoning and Learning Systems
Authors:
Léon Bottou,
Jonas Peters,
Joaquin Quiñonero-Candela,
Denis X. Charles,
D. Max Chickering,
Elon Portugaly,
Dipankar Ray,
Patrice Simard,
Ed Snelson
Abstract:
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select changes that improve both the short-term and long-term performance of such systems. This work is illustrated by experiments carried out on the ad placement system associated with the Bing search engine.
Submitted 27 July, 2013; v1 submitted 11 September, 2012;
originally announced September 2012.