Improving Precancerous Case Characterization via Transformer-based Ensemble Learning

Authors: Yizhen Zhong, Jiajie Xiao, Thomas Vetterli, Mahan Matin, Ellen Loo, Jimmy Lin, Richard Bourgon, Ofer Shapira

Abstract: The application of natural language processing (NLP) to cancer pathology reports has been focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP models to perform the CRC phenotyping, with the goal of extracting precancerous lesion attributes and distinguishing cancer and precancerous cases. We achieved 0.914 macro-F1 scores for classifying patients into negative, non-advanced adenoma, advanced adenoma and CRC. We further improved the performance to 0.923 using an ensemble of classifiers for cancer status classification and lesion size named entity recognition (NER). Our results demonstrated the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.

Submitted 9 December, 2022; originally announced December 2022.
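
The macro-F1 figure quoted in the abstract above is the unweighted mean of the per-class F1 scores over the four classes. The following is a minimal sketch of that metric only, not the authors' evaluation code: the function name `macro_f1`, the toy labels, and the convention of scoring a class with no true or predicted members as 0.0 are all assumptions made for illustration.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # Assumption: a class that never appears in truth or predictions scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Class names taken from the abstract; the example data below is made up.
LABELS = ["negative", "non-advanced adenoma", "advanced adenoma", "CRC"]

if __name__ == "__main__":
    truth = ["negative", "CRC", "advanced adenoma", "non-advanced adenoma"]
    preds = ["negative", "CRC", "non-advanced adenoma", "non-advanced adenoma"]
    print(macro_f1(truth, preds, LABELS))
```

Because every class counts equally, a rare class such as advanced adenoma influences the score as much as the common negative class.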

arXiv:2210.00508 [pdf, ps, other]

doi 10.37236/11659

The lexicographically least square-free word with a given prefix

Authors: Siddharth Berera, Andrés Gómez-Colunga, Joey Lakerdas-Gayle, John López, Mauditra Matin, Daniel Roebuck, Eric Rowland, Noam Scully, Juliet Whidden

Abstract: The lexicographically least square-free infinite word on the alphabet of non-negative integers with a given prefix $p$ is denoted $L(p)$. When $p$ is the empty word, this word was shown by Guay-Paquet and Shallit to be the ruler sequence. For other prefixes, the structure is significantly more complicated. In this paper, we show that $L(p)$ reflects the structure of the ruler sequence for several words $p$. We provide morphisms that generate $L(n)$ for letters $n=1$ and $n\geq3$, and $L(p)$ for most families of two-letter words $p$.

Submitted 2 November, 2022; v1 submitted 2 October, 2022; originally announced October 2022.

Journal ref: The Electronic Journal of Combinatorics 30 (2023) #P3.11 (43 pages)
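
The object $L(p)$ in the abstract above can be explored with a greedy construction: append the smallest non-negative integer that does not create a square at the end of the word. Over an unbounded alphabet a fresh letter always extends a square-free word, so this greedy choice never needs to backtrack. The sketch below is an illustration only, not the paper's morphism-based method; the names `ends_with_square`, `least_square_free`, and `ruler` are hypothetical, and the prefix is assumed to be square-free already.

```python
def ends_with_square(word):
    """Return True if some suffix of `word` has the form xx with x non-empty."""
    n = len(word)
    for half in range(1, n // 2 + 1):
        if word[n - half:] == word[n - 2 * half:n - half]:
            return True
    return False


def least_square_free(prefix, length):
    """First `length` letters of the greedy square-free extension of `prefix`.

    Assumption: `prefix` itself is square-free, so only squares ending at the
    newly appended letter have to be checked.
    """
    word = list(prefix)
    while len(word) < length:
        letter = 0
        while True:
            word.append(letter)
            if not ends_with_square(word):
                break
            word.pop()  # this letter creates a square; try the next one
            letter += 1
    return word[:length]


def ruler(n):
    """Exponent of the largest power of 2 dividing n, for n >= 1."""
    exponent = 0
    while n % 2 == 0:
        n //= 2
        exponent += 1
    return exponent


if __name__ == "__main__":
    print(least_square_free([], 16))     # empty prefix
    print(least_square_free([1], 16))    # prefix p = 1
```

With the empty prefix, the output can be compared against `[ruler(n) for n in range(1, 17)]`, in line with the Guay-Paquet and Shallit result cited in the abstract.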

arXiv:2012.05013 [pdf, other]

Machine Learning for Glacier Monitoring in the Hindu Kush Himalaya

Authors: Shimaa Baraka, Benjamin Akera, Bibek Aryal, Tenzing Sherpa, Finu Shresta, Anthony Ortiz, Kris Sankaran, Juan Lavista Ferres, Mir Matin, Yoshua Bengio

Abstract: Glacier mapping is key to ecological monitoring in the HKH region. Climate change poses a risk to individuals whose livelihoods depend on the health of glacier ecosystems. In this work, we present a machine learning based approach to support ecological monitoring, with a focus on glaciers. Our approach is based on semi-automated mapping from satellite images. We utilize readily available remote sensing data to create a model to identify and outline both clean ice and debris-covered glaciers from satellite imagery. We also release data and develop a web tool that allows experts to visualize and correct model predictions, with the ultimate aim of accelerating the glacier mapping process.

Submitted 9 December, 2020; originally announced December 2020.

Comments: Accepted for a spotlight talk and a poster at the Tackling Climate Change with Machine Learning workshop at NeurIPS 2020

arXiv:2008.07426 [pdf, other]

Hey Human, If your Facial Emotions are Uncertain, You Should Use Bayesian Neural Networks!

Authors: Maryam Matin, Matias Valdenegro-Toro

Abstract: Facial emotion recognition is the task to classify human emotions in face images. It is a difficult task due to high aleatoric uncertainty and visual ambiguity. A large part of the literature aims to show progress by increasing accuracy on this task, but this ignores the inherent uncertainty and ambiguity in the task. In this paper we show that Bayesian Neural Networks, as approximated using MC-Dropout, MC-DropConnect, or an Ensemble, are able to model the aleatoric uncertainty in facial emotion recognition, and produce output probabilities that are closer to what a human expects. We also show that calibration metrics show strange behaviors for this task, due to the multiple classes that can be considered correct, which motivates future work. We believe our work will motivate other researchers to move away from Classical and into Bayesian Neural Networks.

Submitted 17 August, 2020; originally announced August 2020.

Comments: 10 pages, 7 figures, Women in Computer Vision @ ECCV 2020 camera ready

Showing 1–4 of 4 results for author: Matin, M