Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images
Authors:
Peter Washington,
Cezmi Onur Mutlu,
Aaron Kline,
Kelley Paskov,
Nate Tyler Stockham,
Brianna Chrisman,
Nick Deveau,
Mourya Surhabi,
Nick Haber,
Dennis P. Wall
Abstract:
Computer Vision (CV) classifiers which distinguish and detect nonverbal social human behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional and structured classification tasks can be developed with standard machine learning pipelines for supervised learning consisting of data labeling, preprocessing, a…
▽ More
Computer Vision (CV) classifiers which distinguish and detect nonverbal social human behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional and structured classification tasks can be developed with standard machine learning pipelines for supervised learning consisting of data labeling, preprocessing, and training a convolutional neural network, there are several pain points which arise when attempting this process for behavioral phenoty**. Here, we discuss the challenges and corresponding opportunities in this space, including handling heterogeneous data, avoiding biased models, labeling massive and repetitive data sets, working with ambiguous or compound class labels, managing privacy concerns, creating appropriate representations, and personalizing models. We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning. We highlight at least some of the machine learning advancements needed for imaging classifiers to detect human social cues successfully and reliably.
△ Less
Submitted 26 January, 2022;
originally announced January 2022.
Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study
Authors:
Peter Washington,
Haik Kalantarian,
John Kent,
Arman Husic,
Aaron Kline,
Emilie Leblanc,
Cathy Hou,
Onur Cezmi Mutlu,
Kaitlyn Dunlap,
Yordan Penev,
Maya Varma,
Nate Tyler Stockham,
Brianna Chrisman,
Kelley Paskov,
Min Woo Sun,
Jae-Yoon Jung,
Catalin Voss,
Nick Haber,
Dennis Paul Wall
Abstract:
Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emot…
▽ More
Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emotion-enriched images to boost the performance of automatic child emotion recognition models to a level closer to what will be needed for digital health care approaches. Methods: We leveraged our prototype therapeutic smartphone game, GuessWhat, which was designed in large part for children with developmental and behavioral conditions, to gamify the secure collection of video data of children expressing a variety of emotions prompted by the game. Independently, we created a secure web interface to gamify the human labeling effort, called HollywoodSquares, tailored for use by any qualified labeler. We gathered and labeled 2155 videos, 39,968 emotion frames, and 106,001 labels on all images. With this drastically expanded pediatric emotion-centric database (>30 times larger than existing public pediatric emotion data sets), we trained a convolutional neural network (CNN) computer vision classifier of happy, sad, surprised, fearful, angry, disgust, and neutral expressions evoked by children. Results: The classifier achieved a 66.9% balanced accuracy and 67.4% F1-score on the entirety of the Child Affective Facial Expression (CAFE) as well as a 79.1% balanced accuracy and 78% F1-score on CAFE Subset A, a subset containing at least 60% human agreement on emotions labels. This performance is at least 10% higher than all previously developed classifiers evaluated against CAFE, the best of which reached a 56% balanced accuracy even when combining "anger" and "disgust" into a single class.
△ Less
Submitted 3 June, 2024; v1 submitted 15 December, 2020;
originally announced December 2020.