-
DCZNMaker: A Web-based Application for Multi-Attribute Utilities Analysis
Authors:
Adrienne Kline
Abstract:
DCZNMaker is a web-based application designed to streamline decision-making processes using Multi-attribute Utility Analysis (MAUA). Built with simplicity and efficiency in mind, DCZNMaker empowers users to make informed decisions among alternatives (options) by making explicit the factors (attributes) to be taken into consideration, as well as the importance (weights) and utility (location) of each attribute. The app offers a user-friendly interface, allowing individuals to input the various attributes and their associated weights and locations effortlessly. Leveraging advanced algorithms, DCZNMaker computes and presents comprehensive analyses, aiding users in understanding the relative importance of each attribute and guiding them towards optimal decisions. Several use cases are demonstrated. Whether for personal, professional, or academic use, DCZNMaker is a versatile tool adaptable to diverse decision-making scenarios. With its intuitive design and robust functionality, DCZNMaker revolutionizes decision-making processes, empowering individuals or groups of users to make well-informed choices with confidence and clarity.
Submitted 5 July, 2024;
originally announced July 2024.
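
The weights and utilities described in the DCZNMaker abstract above can be combined in several ways. The abstract does not say which aggregation rule the application uses, so the following is only a minimal sketch that assumes the common additive (weighted-sum) form of multi-attribute utility analysis. The function name `maua_scores` and the example attributes and options are hypothetical.

```python
from typing import Dict


def maua_scores(
    weights: Dict[str, float],
    utilities: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Additive multi-attribute utility: sum of normalized weight * utility.

    Assumption: utilities are already scaled to [0, 1] for each attribute.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize the weights so that they sum to 1.
    normalized = {attr: w / total for attr, w in weights.items()}
    return {
        option: sum(normalized[attr] * attr_utils[attr] for attr in normalized)
        for option, attr_utils in utilities.items()
    }


# Hypothetical decision: two job offers scored on two attributes.
weights = {"cost": 3.0, "commute": 1.0}
utilities = {
    "job_a": {"cost": 0.5, "commute": 1.0},
    "job_b": {"cost": 1.0, "commute": 0.0},
}
scores = maua_scores(weights, utilities)
best = max(scores, key=scores.get)
```

Normalizing the weights keeps every score on the same 0 to 1 scale as the utilities, which makes options directly comparable.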
-
YOLO-Angio: An Algorithm for Coronary Anatomy Segmentation
Authors:
Tom Liu,
Hui Lin,
Aggelos K. Katsaggelos,
Adrienne Kline
Abstract:
Coronary angiography remains the gold standard for diagnosis of coronary artery disease, the most common cause of death worldwide. While this procedure is performed more than 2 million times annually, there remain few methods for fast and accurate automated measurement of disease and localization of coronary anatomy. Here, we present our solution to the Automatic Region-based Coronary Artery Disease diagnostics using X-ray angiography images (ARCADE) challenge held at MICCAI 2023. For the artery segmentation task, our three-stage approach combines preprocessing and feature selection by classical computer vision to enhance vessel contrast, followed by an ensemble model based on YOLOv8 to propose possible vessel candidates by generating a vessel map. A final segmentation is based on a logic-based approach to reconstruct the coronary tree in a graph-based sorting method. Our entry to the ARCADE challenge placed 3rd overall. Using the official metric for evaluation, we achieved an F1 score of 0.422 and 0.4289 on the validation and hold-out sets respectively.
Submitted 24 October, 2023;
originally announced October 2023.
-
StenUNet: Automatic Stenosis Detection from X-ray Coronary Angiography
Authors:
Hui Lin,
Tom Liu,
Aggelos Katsaggelos,
Adrienne Kline
Abstract:
Coronary angiography continues to serve as the primary method for diagnosing coronary artery disease (CAD), which is the leading global cause of mortality. The severity of CAD is quantified by the location, degree of narrowing (stenosis), and number of arteries involved. In current practice, this quantification is performed manually using visual inspection and thus suffers from poor inter- and intra-rater reliability. The MICCAI grand challenge: Automatic Region-based Coronary Artery Disease diagnostics using the X-ray angiography imagEs (ARCADE) curated a dataset with stenosis annotations, with the goal of creating an automated stenosis detection algorithm. Using a combination of machine learning and other computer vision techniques, we propose the architecture and algorithm StenUNet to accurately detect stenosis from X-ray Coronary Angiography. Our submission to the ARCADE challenge placed 3rd among all teams. We achieved an F1 score of 0.5348 on the test set, 0.0005 lower than the 2nd place.
Submitted 23 October, 2023;
originally announced October 2023.
-
Information Synergy Maximizes the Growth Rate of Heterogeneous Groups
Authors:
Jordan T Kemp,
Adam G Kline,
Luís MA Bettencourt
Abstract:
Collective action and group formation are fundamental behaviors among both organisms cooperating to maximize their fitness, and people forming socioeconomic organizations. Researchers have extensively explored social interaction structures via game theory and homophilic linkages, such as kin selection and scalar stress, to understand emergent cooperation in complex systems. However, we still lack a general theory capable of predicting how agents benefit from heterogeneous preferences, joint information, or skill complementarities in statistical environments. Here, we derive general statistical dynamics for the origin of cooperation based on the management of resources and pooled information. Specifically, we show how groups that optimally combine complementary agent knowledge about resources in statistical environments maximize their growth rate. We show that these advantages are quantified by the information synergy embedded in the conditional probability of environmental states given agents' signals, such that groups with a greater diversity of signals maximize their collective information. It follows that, when constraints are placed on group formation, agents must intelligently select with whom they cooperate to maximize the synergy available to their own signal. Our results show how the general properties of information underlie the optimal collective formation and dynamics of groups of heterogeneous agents across social and biological phenomena.
Submitted 7 November, 2023; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Multi-Relevance: Coexisting but Distinct Notions of Scale in Large Systems
Authors:
Adam G. Kline,
Stephanie E. Palmer
Abstract:
Renormalization group (RG) methods are emerging as tools in biology and computer science to support the search for simplifying structure in distributions over high-dimensional spaces. We show that mixture models can be thought of as having multiple coexisting, exactly independent RG flows, each with its own notion of scale. We define this property as ``multi-relevance''. As an example, we construct a model that has two distinct notions of scale, each corresponding to the state of an unobserved categorical variable. In the regime where this latent variable can be inferred using a linear classifier, the vertex expansion approach in non-perturbative RG can be applied successfully but will give different answers depending on the choice of expansion point in state space. In the regime where linear estimation of the latent state fails, we show that the vertex expansion predicts a decrease in the total number of relevant couplings from four to three and does not admit a good polynomial truncation scheme. This indicates oversimplification. One consequence of this is that principal component analysis (PCA) may be a poor choice of coarse-graining scheme in multi-relevant systems, since it imposes a notion of scale which is incorrect from the RG perspective. Taken together, our results indicate that RG and PCA can lead to oversimplification when multi-relevance is present and not accounted for.
Submitted 7 February, 2024; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Medical Image Deidentification, Cleaning and Compression Using Pylogik
Authors:
Adrienne Kline,
Vinesh Appadurai,
Yuan Luo,
Sanjiv Shah
Abstract:
Leveraging medical record information in the era of big data and machine learning comes with the caveat that data must be cleaned and de-identified. Facilitating data sharing and harmonization for multi-center collaborations is particularly difficult when protected health information (PHI) is contained or embedded in image meta-data. We propose a novel library in the Python framework, called PyLogik, to help alleviate this issue for ultrasound images, which are particularly challenging because of the frequent inclusion of PHI directly on the images. PyLogik processes the image volumes through a series of text detection/extraction, filtering, thresholding, morphological and contour comparisons. This methodology de-identifies the images, reduces file sizes, and prepares image volumes for applications in deep learning and data sharing. To evaluate its effectiveness in processing ultrasound data, a random sample of 50 cardiac ultrasounds (echocardiograms) was processed through PyLogik, and the outputs were compared with the manual segmentations by an expert user. The Dice coefficient of the two approaches achieved an average value of 0.976. Next, an investigation was conducted to ascertain the degree of information compression achieved using the algorithm. Resultant data were found to be on average ~72% smaller after processing by PyLogik. Our results suggest that PyLogik is a viable methodology for data cleaning and de-identification, determining ROI, and file compression which will facilitate efficient storage, use, and dissemination of ultrasound data. Variants of the pipeline have also been created for use with other medical imaging data types.
Submitted 10 May, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
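
The PyLogik abstract above reports agreement with expert segmentations as a Dice coefficient. As a reminder of what that number measures, here is a minimal sketch of the standard Dice formula, 2|A ∩ B| / (|A| + |B|), on binary masks. It is not PyLogik code; the function name and the nested-list mask representation are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(
    mask_a: Sequence[Sequence[int]],
    mask_b: Sequence[Sequence[int]],
) -> float:
    """Dice overlap between two binary masks given as nested sequences."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    # Count pixels that are foreground in both masks.
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Convention chosen here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x2 masks: automated output vs. manual segmentation.
automated = [[1, 1], [0, 0]]
manual = [[1, 0], [0, 0]]
overlap = dice_coefficient(automated, manual)
```

A value of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.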
-
Machine Learning Capability: A standardized metric using case difficulty with applications to individualized deployment of supervised machine learning
Authors:
Adrienne Kline,
Joon Lee
Abstract:
Model evaluation is a critical component in supervised machine learning classification analyses. Traditional metrics do not currently incorporate case difficulty. This renders the classification results unbenchmarked for generalization. Item Response Theory (IRT) and Computer Adaptive Testing (CAT) with machine learning can benchmark datasets independent of the end-classification results. This provides high levels of case-level information regarding evaluation utility. To showcase, two datasets were used: 1) health-related and 2) physical science. For the health dataset a two-parameter IRT model, and for the physical science dataset a polytomous IRT model, was used to analyze predictive features and place each case on a difficulty continuum. A CAT approach was used to ascertain the algorithms' performance and applicability to new data. This method provides an efficient way to benchmark data, using only a fraction of the dataset (less than 1%), and is 22-60x more computationally efficient than traditional metrics. This novel metric, termed Machine Learning Capability (MLC), has additional benefits as it is unbiased to outcome classification and a standardized way to make model comparisons within and across datasets. MLC provides a metric on the limitation of supervised machine learning algorithms. In situations where the algorithm falls short, other input(s) are required for decision-making.
Submitted 8 February, 2023;
originally announced February 2023.
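
The abstract above mentions a two-parameter IRT model for placing cases on a difficulty continuum. For readers unfamiliar with it, the following minimal sketch evaluates the standard two-parameter logistic (2PL) item response function. How the paper maps classifiers and cases onto ability and item parameters is not stated in the abstract, so the parameter names and the example values are illustrative assumptions only.

```python
import math


def two_pl_probability(
    ability: float, discrimination: float, difficulty: float
) -> float:
    """Standard 2PL model: P(correct) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# Hypothetical values: one ability level evaluated on an easy and a hard case.
p_easy = two_pl_probability(ability=0.0, discrimination=1.5, difficulty=-1.0)
p_hard = two_pl_probability(ability=0.0, discrimination=1.5, difficulty=1.0)
p_matched = two_pl_probability(ability=0.7, discrimination=1.5, difficulty=0.7)
```

When ability equals difficulty the probability is exactly 0.5, and the discrimination parameter controls how sharply the probability changes around that point.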
-
IRTCI: Item Response Theory for Categorical Imputation
Authors:
Adrienne Kline,
Yuan Luo
Abstract:
Most datasets suffer from partial or complete missing values, which has downstream limitations on the available models on which to test the data and on any statistical inferences that can be made from the data. Several imputation techniques have been designed to replace missing data with stand in values. The various approaches have implications for calculating clinical scores, model building and model testing. The work showcased here offers a novel means for categorical imputation based on item response theory (IRT) and compares it against several methodologies currently used in the machine learning field including k-nearest neighbors (kNN), multiple imputed chained equations (MICE) and Amazon Web Services (AWS) deep learning method, Datawig. Analyses comparing these techniques were performed on three different datasets that represented ordinal, nominal and binary categories. The data were modified so that they also varied on both the proportion of data missing and the systematization of the missing data. Two different assessments of performance were conducted: accuracy in reproducing the missing values, and predictive performance using the imputed data. Results demonstrated that the new method, Item Response Theory for Categorical Imputation (IRTCI), fared quite well compared to currently used methods, outperforming several of them in many conditions. Given the theoretical basis for the new approach, and the unique generation of probabilistic terms for determining category belonging for missing cells, IRTCI offers a viable alternative to current approaches.
Submitted 8 February, 2023;
originally announced February 2023.
-
Deep Learning Reveals Patterns of Diverse and Changing Sentiments Towards COVID-19 Vaccines Based on 11 Million Tweets
Authors:
Hanyin Wang,
Meghan R. Hutch,
Yikuan Li,
Adrienne S. Kline,
Sebastian Otero,
Leena B. Mithal,
Emily S. Miller,
Andrew Naidech,
Yuan Luo
Abstract:
Over 12 billion doses of COVID-19 vaccines have been administered at the time of writing. However, public perceptions of vaccines have been complex. We analyzed COVID-19 vaccine-related tweets to understand the evolving perceptions of COVID-19 vaccines. We finetuned a deep learning classifier using a state-of-the-art model, XLNet, to detect each tweet's sentiment automatically. We employed validated methods to extract the users' race or ethnicity, gender, age, and geographical locations from user profiles. Incorporating multiple data sources, we assessed the sentiment patterns among subpopulations and juxtaposed them against vaccine uptake data to unravel their interactive patterns. 11,211,672 COVID-19 vaccine-related tweets corresponding to 2,203,681 users over two years were analyzed. The finetuned model for sentiment classification yielded an accuracy of 0.92 on the testing set. Users from various demographic groups demonstrated distinct patterns in sentiments towards COVID-19 vaccines. User sentiments became more positive over time, upon which we observed a subsequent upswing in the population-level vaccine uptake. Surrounding dates where positive sentiments crest, we detected encouraging news or events regarding vaccine development and distribution. Positive sentiments in pregnancy-related tweets demonstrated a delayed pattern compared with trends in the general population, with postponed vaccine uptake trends. Distinctive patterns across subpopulations suggest the need for tailored strategies. Global news and events were profoundly involved in shaping users' thoughts on social media. Populations with additional concerns, such as pregnancy, demonstrated more substantial hesitancy owing to a lack of timely recommendations. Feature analysis revealed that hesitancies of various subpopulations stemmed from clinical trial logics, risks and complications, and urgency of scientific evidence.
Submitted 5 July, 2022;
originally announced July 2022.
-
Multimodal Machine Learning in Precision Health
Authors:
Adrienne Kline,
Hanyin Wang,
Yikuan Li,
Saya Dennis,
Meghan Hutch,
Zhenxing Xu,
Fei Wang,
Feixiong Cheng,
Yuan Luo
Abstract:
As machine learning and artificial intelligence are more frequently being leveraged to tackle problems in the health sector, there has been increased interest in utilizing them in clinical decision-support. This has historically been the case in single modal data such as electronic health record data. Attempts to improve prediction and resemble the multimodal nature of clinical expert decision-making have been met in the computational field of machine learning by a fusion of disparate data. This review was conducted to summarize this field and identify topics ripe for future research. We conducted this review in accordance with the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) extension for Scoping Reviews to characterize multi-modal data fusion in health. We used a combination of content analysis and literature searches to establish search strings and databases of PubMed, Google Scholar, and IEEEXplore from 2011 to 2021. A final set of 125 articles were included in the analysis. The most common health areas utilizing multi-modal methods were neurology and oncology. However, there exists a wide breadth of current applications. The most common form of information fusion was early fusion. Notably, there was an improvement in predictive performance when performing heterogeneous data fusion. Lacking from the papers were clear clinical deployment strategies and pursuit of FDA-approved tools. These findings provide a map of the current literature on multimodal data fusion as applied to health diagnosis/prognosis problems. Multi-modal machine learning, while more robust in its estimations over unimodal methods, has drawbacks in its scalability and the time-consuming nature of information concatenation.
Submitted 10 April, 2022;
originally announced April 2022.
-
An Exploration of Active Learning for Affective Digital Phenotyping
Authors:
Peter Washington,
Cezmi Mutlu,
Aaron Kline,
Cathy Hou,
Kaitlyn Dunlap,
Jack Kent,
Arman Husic,
Nate Stockham,
Brianna Chrisman,
Kelley Paskov,
Jae-Yoon Jung,
Dennis P. Wall
Abstract:
Some of the most severe bottlenecks preventing widespread development of machine learning models for human behavior include a dearth of labeled training data and difficulty of acquiring high quality labels. Active learning is a paradigm for using algorithms to computationally select a useful subset of data points to label using metrics for model uncertainty and data similarity. We explore active learning for naturalistic computer vision emotion data, a particularly heterogeneous and complex data space due to inherently subjective labels. Using frames collected from gameplay acquired from a therapeutic smartphone game for children with autism, we run a simulation of active learning using gameplay prompts as metadata to aid in the active learning process. We find that active learning using information generated during gameplay slightly outperforms random selection of the same number of labeled frames. We next investigate a method to conduct active learning with subjective data, such as in affective computing, and where multiple crowdsourced labels can be acquired for each image. Using the Child Affective Facial Expression (CAFE) dataset, we simulate an active learning process for crowdsourcing many labels and find that prioritizing frames using the entropy of the crowdsourced label distribution results in lower categorical cross-entropy loss compared to random frame selection. Collectively, these results demonstrate pilot evaluations of two novel active learning approaches for subjective affective data collected in noisy settings.
Submitted 6 April, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
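
The abstract above describes prioritizing frames by the entropy of their crowdsourced label distribution. The sketch below illustrates that idea with Shannon entropy over label counts. It is not the authors' pipeline; the frame identifiers, emotion labels, and function name are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def label_entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    entropy = 0.0
    for count in counts.values():
        p = count / n
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical crowdsourced emotion labels for three frames.
frames: Dict[str, List[str]] = {
    "frame_1": ["happy", "happy", "happy", "happy"],
    "frame_2": ["happy", "sad", "happy", "sad"],
    "frame_3": ["happy", "happy", "happy", "sad"],
}

# Frames whose labels disagree most (highest entropy) are queued first.
priority = sorted(frames, key=lambda f: label_entropy(frames[f]), reverse=True)
```

High entropy indicates that annotators disagree, which is one reasonable signal that a frame is ambiguous and worth prioritizing.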
-
Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images
Authors:
Peter Washington,
Cezmi Onur Mutlu,
Aaron Kline,
Kelley Paskov,
Nate Tyler Stockham,
Brianna Chrisman,
Nick Deveau,
Mourya Surhabi,
Nick Haber,
Dennis P. Wall
Abstract:
Computer Vision (CV) classifiers which distinguish and detect nonverbal social human behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional and structured classification tasks can be developed with standard machine learning pipelines for supervised learning consisting of data labeling, preprocessing, and training a convolutional neural network, there are several pain points which arise when attempting this process for behavioral phenotyping. Here, we discuss the challenges and corresponding opportunities in this space, including handling heterogeneous data, avoiding biased models, labeling massive and repetitive data sets, working with ambiguous or compound class labels, managing privacy concerns, creating appropriate representations, and personalizing models. We discuss current state-of-the-art research endeavors in CV such as data curation, data augmentation, crowdsourced labeling, active learning, reinforcement learning, generative models, representation learning, federated learning, and meta-learning. We highlight at least some of the machine learning advancements needed for imaging classifiers to detect human social cues successfully and reliably.
Submitted 26 January, 2022;
originally announced January 2022.
-
Classifying Autism from Crowdsourced Semi-Structured Speech Recordings: A Machine Learning Approach
Authors:
Nathan A. Chi,
Peter Washington,
Aaron Kline,
Arman Husic,
Cathy Hou,
Chloe He,
Kaitlyn Dunlap,
Dennis Wall
Abstract:
Autism spectrum disorder (ASD) is a neurodevelopmental disorder which results in altered behavior, social development, and communication patterns. In past years, autism prevalence has tripled, with 1 in 54 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process, significant attention has been given to developing systems that automatically screen for autism. Prosody abnormalities are among the clearest signs of autism, with affected children displaying speech idiosyncrasies including echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns. In this work, we present a suite of machine learning approaches to detect autism in self-recorded speech audio captured from autistic and neurotypical (NT) children in home environments. We consider three methods to detect autism in child speech: first, Random Forests trained on extracted audio features (including Mel-frequency cepstral coefficients); second, convolutional neural networks (CNNs) trained on spectrograms; and third, fine-tuned wav2vec 2.0--a state-of-the-art Transformer-based ASR model. We train our classifiers on our novel dataset of cellphone-recorded child speech audio curated from Stanford's Guess What? mobile game, an app designed to crowdsource videos of autistic and neurotypical children in a natural home environment. The Random Forest classifier achieves 70% accuracy, the fine-tuned wav2vec 2.0 model achieves 77% accuracy, and the CNN achieves 79% accuracy when classifying children's audio as either ASD or NT. Our models were able to predict autism status when training on a varied selection of home audio clips with inconsistent recording quality, which may be more generalizable to real world conditions. These results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
Submitted 3 January, 2022;
originally announced January 2022.
-
Training and Profiling a Pediatric Emotion Recognition Classifier on Mobile Devices
Authors:
Agnik Banerjee,
Peter Washington,
Cezmi Mutlu,
Aaron Kline,
Dennis P. Wall
Abstract:
Implementing automated emotion recognition on mobile devices could provide an accessible diagnostic and therapeutic tool for those who struggle to recognize emotion, including children with developmental behavioral conditions such as autism. Although recent advances have been made in building more accurate emotion classifiers, existing models are too computationally expensive to be deployed on mobile devices. In this study, we optimized and profiled various machine learning models designed for inference on edge devices and were able to match previous state of the art results for emotion recognition on children. Our best model, a MobileNet-V2 network pre-trained on ImageNet, achieved 65.11% balanced accuracy and 64.19% F1-score on CAFE, while achieving a 45-millisecond inference latency on a Motorola Moto G6 phone. This balanced accuracy is only 1.79% less than the current state of the art for CAFE, which used a model that contains 26.62x more parameters and was unable to run on the Moto G6, even when fully optimized. This work validates that with specialized design and optimization techniques, machine learning models can become lightweight enough for deployment on mobile devices and still achieve high accuracies on difficult image classification tasks.
Submitted 21 August, 2021;
originally announced August 2021.
-
Classification of Abnormal Hand Movement for Aiding in Autism Detection: Machine Learning Study
Authors:
Anish Lakkapragada,
Aaron Kline,
Onur Cezmi Mutlu,
Kelley Paskov,
Brianna Chrisman,
Nate Stockham,
Peter Washington,
Dennis Wall
Abstract:
A formal autism diagnosis can be an inefficient and lengthy process. Families may wait months or longer before receiving a diagnosis for their child despite evidence that earlier intervention leads to better treatment outcomes. Digital technologies which detect the presence of behaviors related to autism can scale access to pediatric diagnoses. This work aims to demonstrate the feasibility of deep learning technologies for detecting hand flapping from unstructured home videos as a first step towards validating whether models and digital technologies can be leveraged to aid with autism diagnoses. We used the Self-Stimulatory Behavior Dataset (SSBD), which contains 75 videos of hand flapping, head banging, and spinning exhibited by children. From all the hand flapping videos, we extracted 100 positive and control videos of hand flapping, each between 2 to 5 seconds in duration. Utilizing both landmark-driven-approaches and MobileNet V2's pretrained convolutional layers, our highest performing model achieved a testing F1 score of 84% (90% precision and 80% recall) when evaluating with 5-fold cross validation 100 times. This work provides the first step towards developing precise deep learning methods for activity detection of autism-related behaviors.
Submitted 6 June, 2022; v1 submitted 17 August, 2021;
originally announced August 2021.
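
The abstract above reports an F1 score alongside precision and recall. As a quick reference, F1 is the harmonic mean of the two, which the following minimal sketch computes. Note the caveat: the reported figures are averages over repeated cross-validation runs, so the F1 computed from the averaged precision and recall need not equal the reported average F1 exactly. The function name is an illustrative choice.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Using the precision and recall quoted in the abstract (90% and 80%).
f1_from_reported = f1_score(0.90, 0.80)
```

With these inputs the harmonic mean is roughly 0.85, in line with the reported 84% once averaging across folds is taken into account.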
-
Gaussian Information Bottleneck and the Non-Perturbative Renormalization Group
Authors:
Adam G. Kline,
Stephanie E. Palmer
Abstract:
The renormalization group (RG) is a class of theoretical techniques used to explain the collective physics of interacting, many-body systems. It has been suggested that the RG formalism may be useful in finding and interpreting emergent low-dimensional structure in complex systems outside of the traditional physics context, such as in biology or computer science. In such contexts, one common dimensionality-reduction framework already in use is information bottleneck (IB), in which the goal is to compress an ``input'' signal $X$ while maximizing its mutual information with some stochastic ``relevance'' variable $Y$. IB has been applied in the vertebrate and invertebrate processing systems to characterize optimal encoding of the future motion of the external world. Other recent work has shown that the RG scheme for the dimer model could be ``discovered'' by a neural network attempting to solve an IB-like problem. This manuscript explores whether IB and any existing formulation of RG are formally equivalent. A class of soft-cutoff non-perturbative RG techniques are defined by families of non-deterministic coarsening maps, and hence can be formally mapped onto IB, and vice versa. For concreteness, this discussion is limited entirely to Gaussian statistics (GIB), for which IB has exact, closed-form solutions. Under this constraint, GIB has a semigroup structure, in which successive transformations remain IB-optimal. Further, the RG cutoff scheme associated with GIB can be identified. Our results suggest that IB can be used to impose a notion of ``large scale'' structure, such as biological function, on an RG procedure.
Submitted 28 July, 2021;
originally announced July 2021.
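To make the "exact, closed-form solutions" of Gaussian IB concrete, here is a minimal sketch of the simplest case: scalar, jointly Gaussian X and Y with correlation `rho`, and a compressed variable T = a·X + noise with unit-variance Gaussian noise. It assumes the common convention of minimizing I(X;T) − β·I(T;Y); the paper may use a different parametrization, and the function name `scalar_gib` is ours. Setting the derivative of that objective to zero gives a critical trade-off value β_c = 1/ρ², below which the optimal encoder transmits nothing.

```python
def scalar_gib(rho: float, beta: float, var_x: float = 1.0):
    """Scalar Gaussian IB for T = a*X + xi, xi ~ N(0, 1).

    Returns (lam, beta_c, a_squared), where
      lam       = Var(X|Y) / Var(X) = 1 - rho**2,
      beta_c    = 1 / (1 - lam), the critical trade-off parameter,
      a_squared = optimal squared gain (0 below beta_c).
    Requires 0 < |rho| < 1.
    """
    if not 0.0 < abs(rho) < 1.0:
        raise ValueError("rho must satisfy 0 < |rho| < 1")
    lam = 1.0 - rho ** 2
    beta_c = 1.0 / (1.0 - lam)
    if beta <= beta_c:
        a_squared = 0.0  # compression wins: T carries no information about X
    else:
        a_squared = (beta * (1.0 - lam) - 1.0) / (lam * var_x)
    return lam, beta_c, a_squared


lam, beta_c, a_squared = scalar_gib(rho=0.8, beta=2.0)
print(lam, beta_c, a_squared)
```

In the multivariate case the same quantities appear once per eigen-direction, which is what allows successive transformations to be compared with a coarse-graining cutoff.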
-
Activity Recognition with Moving Cameras and Few Training Examples: Applications for Detection of Autism-Related Headbanging
Authors:
Peter Washington,
Aaron Kline,
Onur Cezmi Mutlu,
Emilie Leblanc,
Cathy Hou,
Nate Stockham,
Kelley Paskov,
Brianna Chrisman,
Dennis P. Wall
Abstract:
Activity recognition computer vision algorithms can be used to detect the presence of autism-related behaviors, including what are termed "restricted and repetitive behaviors", or stimming, by diagnostic instruments. The limited data that exist in this domain are usually recorded with a handheld camera which can be shaky or even moving, posing a challenge for traditional feature representation approaches for activity detection which mistakenly capture the camera's motion as a feature. To address these issues, we first document the advantages and limitations of current feature representation techniques for activity recognition when applied to head banging detection. We then propose a feature representation consisting exclusively of head pose keypoints. We create a computer vision classifier for detecting head banging in home videos using a time-distributed convolutional neural network (CNN) in which a single CNN extracts features from each frame in the input sequence, and these extracted features are fed as input to a long short-term memory (LSTM) network. On the binary task of predicting head banging and no head banging within videos from the Self Stimulatory Behaviour Dataset (SSBD), we reach a mean F1-score of 90.77% using 3-fold cross validation (with individual fold F1-scores of 83.3%, 89.0%, and 100.0%) when ensuring that no child who appeared in the train set was in the test set for all folds. This work documents a successful technique for training a computer vision classifier which can detect human motion with few training examples and even when the camera recording the source clips is unstable. The general methods described here can be applied by designers and developers of interactive systems towards other human motion and pose classification problems used in mobile and ubiquitous interactive systems.
Submitted 10 January, 2021;
originally announced January 2021.
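The head banging evaluation above relies on folds in which no child appears in both the train and test sets. A minimal sketch of such a subject-wise split follows; the greedy assignment and the name `grouped_folds` are illustrative assumptions, not the authors' procedure. The last lines confirm that the three reported fold F1-scores average to the reported mean.

```python
from collections import defaultdict
from statistics import mean


def grouped_folds(child_ids, n_folds=3):
    """Assign clip indices to folds so that all clips of one child share a fold."""
    clips_by_child = defaultdict(list)
    for idx, child in enumerate(child_ids):
        clips_by_child[child].append(idx)
    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place the largest groups first, each into the smallest fold.
    ordered = sorted(clips_by_child, key=lambda c: (-len(clips_by_child[c]), c))
    for child in ordered:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(clips_by_child[child])
    return folds


# Hypothetical clip-to-child mapping, for illustration only.
example_ids = ["a", "a", "b", "c", "c", "c", "d"]
example_folds = grouped_folds(example_ids, n_folds=3)

# Fold F1-scores as reported in the abstract above.
reported_mean = round(mean([83.3, 89.0, 100.0]), 2)
print(example_folds, reported_mean)
```

Each fold in turn serves as the test set while the remaining folds form the train set, so a child's clips never straddle the two.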
-
Training Affective Computer Vision Models by Crowdsourcing Soft-Target Labels
Authors:
Peter Washington,
Onur Cezmi Mutlu,
Emilie Leblanc,
Aaron Kline,
Cathy Hou,
Brianna Chrisman,
Nate Stockham,
Kelley Paskov,
Catalin Voss,
Nick Haber,
Dennis Wall
Abstract:
Emotion classifiers traditionally predict discrete emotions. However, emotion expressions are often subjective, thus requiring a method to handle subjective labels. We explore the use of crowdsourcing to acquire reliable soft-target labels and evaluate an emotion detection classifier trained with these labels. We center our study on the Child Affective Facial Expression (CAFE) dataset, a gold standard collection of images depicting pediatric facial expressions along with 100 human labels per image. To test the feasibility of crowdsourcing to generate these labels, we used Microworkers to acquire labels for 207 CAFE images. We evaluate both unfiltered workers as well as workers selected through a short crowd filtration process. We then train two versions of a classifier on CAFE labels using the original 100 annotations provided with the dataset: (1) a classifier trained with traditional one-hot encoded labels, and (2) a classifier trained with vector labels representing the distribution of CAFE annotator responses. We compare the resulting softmax output distributions of the two classifiers with a 2-sample independent t-test of L1 distances between the classifier's output probability distribution and the distribution of human labels. While agreement with CAFE is weak for unfiltered crowd workers, the filtered crowd agree with the CAFE labels 100% of the time for many emotions. While the F1-score for a one-hot encoded classifier is much higher (94.33% vs. 78.68%) with respect to the ground truth CAFE labels, the output probability vector of the crowd-trained classifier more closely resembles the distribution of human labels (t=3.2827, p=0.0014). Such a classifier reports an emotion probability distribution that accounts for the subjectivity of human interpretation. Crowdsourcing, including a sufficient filtering mechanism, is a feasible solution for acquiring soft-target labels.
Submitted 22 September, 2021; v1 submitted 10 January, 2021;
originally announced January 2021.
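The difference between one-hot and soft-target labels in the entry above can be sketched in a few lines. The vote counts below are hypothetical and the helper names are ours; the only elements taken from the abstract are the idea of a label vector that mirrors the annotator distribution and the use of L1 distance between distributions.

```python
from collections import Counter


def soft_label(votes, classes):
    """Fraction of annotators choosing each class, in the order of `classes`."""
    counts = Counter(votes)
    return [counts[c] / len(votes) for c in classes]


def one_hot_label(votes, classes):
    """1.0 for the majority class, 0.0 elsewhere."""
    counts = Counter(votes)
    winner = max(classes, key=lambda c: counts[c])
    return [1.0 if c == winner else 0.0 for c in classes]


def l1_distance(p, q):
    """Sum of absolute differences between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical image with 100 annotations (illustrative, not CAFE data).
classes = ["happy", "neutral", "sad"]
votes = ["happy"] * 60 + ["neutral"] * 30 + ["sad"] * 10

soft = soft_label(votes, classes)
hard = one_hot_label(votes, classes)
print(soft, hard, l1_distance(soft, hard))
```

A classifier trained against `hard` is rewarded for putting all probability on the majority class, whereas one trained against `soft` is rewarded for reproducing the annotator disagreement itself.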
-
Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study
Authors:
Peter Washington,
Haik Kalantarian,
John Kent,
Arman Husic,
Aaron Kline,
Emilie Leblanc,
Cathy Hou,
Onur Cezmi Mutlu,
Kaitlyn Dunlap,
Yordan Penev,
Maya Varma,
Nate Tyler Stockham,
Brianna Chrisman,
Kelley Paskov,
Min Woo Sun,
Jae-Yoon Jung,
Catalin Voss,
Nick Haber,
Dennis Paul Wall
Abstract:
Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emotion-enriched images to boost the performance of automatic child emotion recognition models to a level closer to what will be needed for digital health care approaches. Methods: We leveraged our prototype therapeutic smartphone game, GuessWhat, which was designed in large part for children with developmental and behavioral conditions, to gamify the secure collection of video data of children expressing a variety of emotions prompted by the game. Independently, we created a secure web interface to gamify the human labeling effort, called HollywoodSquares, tailored for use by any qualified labeler. We gathered and labeled 2155 videos, 39,968 emotion frames, and 106,001 labels on all images. With this drastically expanded pediatric emotion-centric database (>30 times larger than existing public pediatric emotion data sets), we trained a convolutional neural network (CNN) computer vision classifier of happy, sad, surprised, fearful, angry, disgust, and neutral expressions evoked by children. Results: The classifier achieved a 66.9% balanced accuracy and 67.4% F1-score on the entirety of the Child Affective Facial Expression (CAFE) data set as well as a 79.1% balanced accuracy and 78% F1-score on CAFE Subset A, a subset containing at least 60% human agreement on emotion labels. This performance is at least 10% higher than all previously developed classifiers evaluated against CAFE, the best of which reached a 56% balanced accuracy even when combining "anger" and "disgust" into a single class.
Submitted 3 June, 2024; v1 submitted 15 December, 2020;
originally announced December 2020.
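The results above are reported as balanced accuracy, which is the mean of per-class recall and therefore does not reward a classifier for favoring frequent classes. A minimal sketch with hypothetical labels follows; the function name and the toy data are ours.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that occur in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced test set: a classifier that always predicts "happy".
y_true = ["happy"] * 8 + ["sad"] * 2
y_pred = ["happy"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))  # prints 0.8 0.5
```

The trivial classifier scores 80% plain accuracy but only 50% balanced accuracy, which is why balanced accuracy is the more informative figure for a seven-class emotion task with uneven class sizes.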
-
A Wearable Social Interaction Aid for Children with Autism
Authors:
Nick Haber,
Catalin Voss,
Jena Daniels,
Peter Washington,
Azar Fazel,
Aaron Kline,
Titas De,
Terry Winograd,
Carl Feinstein,
Dennis P. Wall
Abstract:
With most recent estimates giving an incidence rate of 1 in 68 children in the United States, autism spectrum disorder (ASD) is a growing public health crisis. Many of these children struggle to make eye contact, recognize facial expressions, and engage in social interactions. Today the standard for treatment of the core autism-related deficits focuses on a form of behavior training known as Applied Behavioral Analysis (ABA). To address perceived deficits in expression recognition, ABA approaches routinely involve the use of prompts such as flash cards for repetitive emotion recognition training via memorization. These techniques must be administered by trained practitioners and often at clinical centers that are far outnumbered by and out of reach of the many children and families in need of attention. Waitlists for access are up to 18 months long, and this wait may lead to children regressing down a path of isolation that worsens their long-term prognosis. There is an urgent need to innovate new methods of care delivery that can appropriately empower caregivers of children at risk or with a diagnosis of autism, and that capitalize on mobile tools and wearable devices for use outside of clinical settings.
Submitted 19 April, 2020;
originally announced April 2020.
-
Superpower Glass: Delivering Unobtrusive Real-time Social Cues in Wearable Systems
Authors:
Catalin Voss,
Peter Washington,
Nick Haber,
Aaron Kline,
Jena Daniels,
Azar Fazel,
Titas De,
Beth McCarthy,
Carl Feinstein,
Terry Winograd,
Dennis Wall
Abstract:
We have developed a system for automatic facial expression recognition, which runs on Google Glass and delivers real-time social cues to the wearer. We evaluate the system as a behavioral aid for children with Autism Spectrum Disorder (ASD), who can greatly benefit from real-time non-invasive emotional cues and are more sensitive to sensory input than neurotypically developing children. In addition, we present a mobile application that enables users of the wearable aid to review their videos along with auto-curated emotional information on the video playback bar. This integrates our learning aid into the context of behavioral therapy. Expanding on our previous work describing in-lab trials, this paper presents our system and application-level design decisions in depth as well as the interface learnings gathered during the use of the system by multiple children with ASD in an at-home iterative trial.
Submitted 16 February, 2020;
originally announced February 2020.
-
Designing a Holistic At-Home Learning Aid for Autism
Authors:
Catalin Voss,
Nick Haber,
Peter Washington,
Aaron Kline,
Beth McCarthy,
Jena Daniels,
Azar Fazel,
Titas De,
Carl Feinstein,
Terry Winograd,
Dennis Wall
Abstract:
In recent years, much focus has been put on employing technology to make novel behavioural aids for those with autism. Most of these are digital adaptations of tools used in standard behavioural therapy to enforce normative skills. These digital counterparts are often used outside of both the larger therapeutic context and the real world, in which the learned skills might apply. To address this, we are designing a system of automatic expression recognition on wearable devices that integrates directly into families' daily social interactions, to give children and their caregivers the tools and information they need to design their own holistic therapy. In order to develop a tool that will be truly useful to families, we proactively include children with autism and their families as co-designers in the development process. By providing an app and interface with interchangeable social feedback options, we aim to produce a framework for therapy that folds into their daily lives, tailored to their specific needs.
Submitted 11 February, 2020;
originally announced February 2020.
-
Precision measurement of tribocharging in acoustically levitated sub-millimeter grains
Authors:
Adam G. Kline,
Melody Xuan Lim,
Heinrich M. Jaeger
Abstract:
Contact electrification of dielectric grains forms the basis for a myriad of physical phenomena. However, even the basic aspects of collisional charging between grains are still unclear. Here we develop a new experimental method, based on acoustic levitation, which allows us to controllably and repeatedly collide two sub-millimeter grains and measure the evolution of their electric charges. This is therefore the first tribocharging experiment to provide complete electric isolation for the grain-grain system from its surroundings. We use this method to measure collisional charging rates between pairs of grains for three different material combinations: polyethylene-polyethylene, polystyrene-polystyrene, and polystyrene-sulfonated polystyrene. The ability to directly and noninvasively collide particles of different constituent materials, chemical functionality, size, and shape opens the door to detailed studies of collisional charging in granular materials.
Submitted 21 October, 2019;
originally announced October 2019.
-
Theories on PHYlogenetic ReconstructioN (PHYRN)
Authors:
Gaurav Bhardwaj,
Zhenhai Zhang,
Yoojin Hong,
Kyung Dae Ko,
Gue Su Chang,
Evan J. Smith,
Lindsay A. Kline,
D. Nicholas Hartranft,
Edward C. Holmes,
Randen L. Patterson,
Damian B. van Rossum
Abstract:
The inability to resolve deep node relationships of highly divergent/rapidly evolving protein families is a major factor that stymies evolutionary studies. In this manuscript, we propose a Multiple Sequence Alignment (MSA) independent method to infer evolutionary relationships. We previously demonstrated that phylogenetic profiles built using position specific scoring matrices (PSSMs) are capable of constructing informative evolutionary histories (1;2). In this manuscript, we theorize that PSSMs derived specifically from the query sequences used to construct the phylogenetic tree will improve this method for the study of rapidly evolving proteins. To test this theory, we performed phylogenetic analyses of a benchmark protein superfamily (reverse transcriptases (RT)) as well as simulated datasets. When we compare the results obtained from our method, PHYlogenetic ReconstructioN (PHYRN), with other MSA dependent methods, we observe that PHYRN provides a 4- to 100-fold increase in accurate measurements at deep nodes. As phylogenetic profiles are used as the information source, rather than MSA, we propose PHYRN as a paradigm shift in studying evolution when MSA approaches fail. Perhaps most importantly, due to the improvements in our computational approach and the availability of vast amounts of sequencing data, PHYRN is scalable to thousands of sequences. Taken together with PHYRN's adaptability to any protein family, this method can serve as a tool for resolving ambiguities in evolutionary studies of rapidly evolving/highly divergent protein families.
Submitted 26 February, 2010; v1 submitted 2 February, 2010;
originally announced February 2010.