-
Improving Precancerous Case Characterization via Transformer-based Ensemble Learning
Authors:
Yizhen Zhong,
Jiajie Xiao,
Thomas Vetterli,
Mahan Matin,
Ellen Loo,
Jimmy Lin,
Richard Bourgon,
Ofer Shapira
Abstract:
The application of natural language processing (NLP) to cancer pathology reports has focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer and precancerous cases. We achieved a macro-F1 score of 0.914 for classifying patients into negative, non-advanced adenoma, advanced adenoma and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrated the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
Submitted 9 December, 2022;
originally announced December 2022.
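For readers unfamiliar with the metric quoted in the abstract above, macro-F1 is the unweighted mean of the per-class F1 scores over the four classes. The following is a minimal sketch in plain Python; the toy labels and predictions are hypothetical and are not taken from the paper's data, and the function names are illustrative only.

```python
# Class names follow the four categories named in the abstract.
LABELS = ["negative", "non-advanced adenoma", "advanced adenoma", "CRC"]


def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0  # convention: F1 is 0 when there are no true positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy example: two patients per class, two misclassified.
Y_TRUE = ["negative", "negative",
          "non-advanced adenoma", "non-advanced adenoma",
          "advanced adenoma", "advanced adenoma",
          "CRC", "CRC"]
Y_PRED = ["negative", "negative",
          "non-advanced adenoma", "advanced adenoma",
          "advanced adenoma", "advanced adenoma",
          "CRC", "negative"]

if __name__ == "__main__":
    print(round(macro_f1(Y_TRUE, Y_PRED, LABELS), 4))
```

Because each class contributes equally to the average, errors on a small class such as advanced adenoma lower the score as much as errors on a large one, which is one plausible reason to report macro-F1 for imbalanced clinical cohorts.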
-
The lexicographically least square-free word with a given prefix
Authors:
Siddharth Berera,
Andrés Gómez-Colunga,
Joey Lakerdas-Gayle,
John López,
Mauditra Matin,
Daniel Roebuck,
Eric Rowland,
Noam Scully,
Juliet Whidden
Abstract:
The lexicographically least square-free infinite word on the alphabet of non-negative integers with a given prefix $p$ is denoted $L(p)$. When $p$ is the empty word, this word was shown by Guay-Paquet and Shallit to be the ruler sequence. For other prefixes, the structure is significantly more complicated. In this paper, we show that $L(p)$ reflects the structure of the ruler sequence for several words $p$. We provide morphisms that generate $L(n)$ for letters $n=1$ and $n\geq3$, and $L(p)$ for most families of two-letter words $p$.
Submitted 2 November, 2022; v1 submitted 2 October, 2022;
originally announced October 2022.
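The object $L(p)$ described in the abstract above can be explored with a small greedy construction: extend the word one letter at a time, always choosing the least non-negative integer that does not create a square at the end. This is only a sketch, not the paper's morphism-based construction. It assumes the prefix is itself square-free, and the function names are illustrative. The greedy choice never gets stuck because a letter that has not yet appeared can always be appended without creating a square.

```python
def ends_with_square(word):
    """Return True if some suffix of `word` has the form xx with x non-empty."""
    n = len(word)
    for half in range(1, n // 2 + 1):
        if word[n - 2 * half:n - half] == word[n - half:]:
            return True
    return False


def least_square_free(prefix, length):
    """First `length` letters of the greedy square-free extension of `prefix`.

    Assumes `prefix` is a square-free list of non-negative integers.
    Only suffixes need checking at each step, since the word built so far
    is already square-free.
    """
    word = list(prefix)
    while len(word) < length:
        letter = 0
        while ends_with_square(word + [letter]):
            letter += 1
        word.append(letter)
    return word[:length]


def ruler(n):
    """Ruler sequence term: exponent of the largest power of 2 dividing n >= 1."""
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v


if __name__ == "__main__":
    print(least_square_free([], 16))
    print([ruler(n) for n in range(1, 17)])
```

With the empty prefix the two printed lists should agree, matching the ruler-sequence result attributed to Guay-Paquet and Shallit in the abstract. Calling `least_square_free([1], 16)` gives a quick look at a one-letter prefix.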
-
Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya
Authors:
Shimaa Baraka,
Benjamin Akera,
Bibek Aryal,
Tenzing Sherpa,
Finu Shresta,
Anthony Ortiz,
Kris Sankaran,
Juan Lavista Ferres,
Mir Matin,
Yoshua Bengio
Abstract:
Glacier mapping is key to ecological monitoring in the Hindu Kush Himalaya (HKH) region. Climate change poses a risk to individuals whose livelihoods depend on the health of glacier ecosystems. In this work, we present a machine-learning-based approach to support ecological monitoring, with a focus on glaciers. Our approach is based on semi-automated mapping from satellite images. We utilize readily available remote sensing data to create a model to identify and outline both clean ice and debris-covered glaciers from satellite imagery. We also release data and develop a web tool that allows experts to visualize and correct model predictions, with the ultimate aim of accelerating the glacier mapping process.
Submitted 9 December, 2020;
originally announced December 2020.
-
Hey Human, If your Facial Emotions are Uncertain, You Should Use Bayesian Neural Networks!
Authors:
Maryam Matin,
Matias Valdenegro-Toro
Abstract:
Facial emotion recognition is the task of classifying human emotions in face images. It is a difficult task due to high aleatoric uncertainty and visual ambiguity. A large part of the literature aims to show progress by increasing accuracy on this task, but this ignores the inherent uncertainty and ambiguity in the task. In this paper we show that Bayesian Neural Networks, as approximated using MC-Dropout, MC-DropConnect, or an Ensemble, are able to model the aleatoric uncertainty in facial emotion recognition, and produce output probabilities that are closer to what a human expects. We also show that calibration metrics behave strangely on this task, because multiple classes can be considered correct, which motivates future work. We believe our work will motivate other researchers to move away from Classical and into Bayesian Neural Networks.
Submitted 17 August, 2020;
originally announced August 2020.
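As an illustration of the MC-Dropout approximation mentioned in the abstract above, the sketch below keeps dropout active at prediction time and averages the softmax outputs of several stochastic forward passes. It is a toy in plain Python, not the authors' model: the two-layer network, its weights, the input vector and all function names are hypothetical, chosen only to make the averaging step concrete.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(x, w_hidden, w_out, drop_p, rng):
    """One stochastic forward pass; drop_p must be in [0, 1)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    keep = 1.0 - drop_p
    # Dropout stays ON at inference: drop each unit with prob drop_p, rescale the rest.
    hidden = [h / keep if rng.random() < keep else 0.0 for h in hidden]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(logits)


def mc_dropout_predict(x, w_hidden, w_out, drop_p=0.5, passes=100, seed=0):
    """Average class probabilities over `passes` stochastic forward passes."""
    rng = random.Random(seed)
    mean = [0.0] * len(w_out)
    for _ in range(passes):
        probs = forward_with_dropout(x, w_hidden, w_out, drop_p, rng)
        mean = [m + p / passes for m, p in zip(mean, probs)]
    return mean


def predictive_entropy(probs):
    """Entropy of the averaged distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical toy weights: 3 inputs -> 4 hidden units -> 3 emotion classes.
W_HIDDEN = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.6, 0.4, 0.9], [0.2, 0.2, 0.2]]
W_OUT = [[1.0, -0.5, 0.3, 0.2], [-0.4, 0.9, 0.1, 0.0], [0.2, 0.1, -0.7, 0.6]]
X = [1.0, 0.5, -0.3]

if __name__ == "__main__":
    averaged = mc_dropout_predict(X, W_HIDDEN, W_OUT)
    print(averaged, predictive_entropy(averaged))
```

Setting `drop_p=0.0` reduces the procedure to a single deterministic pass, which is a convenient sanity check. The gap between that output and the averaged one gives a rough sense of how much the stochastic passes soften the prediction.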