-
3D-Morphomics, Morphological Features on CT scans for lung nodule malignancy diagnosis
Authors:
Elias Munoz,
Pierre Baudot,
Van-Khoa Le,
Charles Voyton,
Benjamin Renoust,
Danny Francis,
Vladimir Groza,
Jean-Christophe Brisset,
Ezequiel Geremia,
Antoine Iannessi,
Yan Liu,
Benoit Huet
Abstract:
Pathologies systematically induce morphological changes, thus providing a major but yet insufficiently quantified source of observables for diagnosis. The study develops a predictive model of the pathological states based on morphological features (3D-morphomics) on Computed Tomography (CT) volumes. A complete workflow for mesh extraction and simplification of an organ's surface is developed, and…
▽ More
Pathologies systematically induce morphological changes, thus providing a major but yet insufficiently quantified source of observables for diagnosis. The study develops a predictive model of the pathological states based on morphological features (3D-morphomics) on Computed Tomography (CT) volumes. A complete workflow for mesh extraction and simplification of an organ's surface is developed, and coupled with an automatic extraction of morphological features given by the distribution of mean curvature and mesh energy. An XGBoost supervised classifier is then trained and tested on the 3D-morphomics to predict the pathological states. This framework is applied to the prediction of the malignancy of lung's nodules. On a subset of NLST database with malignancy confirmed biopsy, using 3D-morphomics only, the classification model of lung nodules into malignant vs. benign achieves 0.964 of AUC. Three other sets of classical features are trained and tested, (1) clinical relevant features gives an AUC of 0.58, (2) 111 radiomics gives an AUC of 0.976, (3) radiologist ground truth (GT) containing the nodule size, attenuation and spiculation qualitative annotations gives an AUC of 0.979. We also test the Brock model and obtain an AUC of 0.826. Combining 3D-morphomics and radiomics features achieves state-of-the-art results with an AUC of 0.978 where the 3D-morphomics have some of the highest predictive powers. As a validation on a public independent cohort, models are applied to the LIDC dataset, the 3D-morphomics achieves an AUC of 0.906 and the 3D-morphomics+radiomics achieves an AUC of 0.958, which ranks second in the challenge among deep models. It establishes the curvature distributions as efficient features for predicting lung nodule malignancy and a new method that can be applied directly to arbitrary computer aided diagnosis task.
△ Less
Submitted 27 July, 2022;
originally announced July 2022.
-
UniMorph 4.0: Universal Morphology
Authors:
Khuyagbaatar Batsuren,
Omer Goldman,
Salam Khalifa,
Nizar Habash,
Witold Kieraś,
Gábor Bella,
Brian Leonard,
Garrett Nicolai,
Kyle Gorman,
Yustinus Ghanggo Ate,
Maria Ryskina,
Sabrina J. Mielke,
Elena Budianskaya,
Charbel El-Khaissi,
Tiago Pimentel,
Michael Gasser,
William Lane,
Mohit Raj,
Matt Coler,
Jaime Rafael Montoya Samame,
Delio Siticonatzi Camaiteri,
Benoît Sagot,
Esaú Zumaeta Rojas,
Didier López Francis,
Arturo Oncevay
, et al. (71 additional authors not shown)
Abstract:
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This pa…
▽ More
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last couple of years (since McCarthy et al. (2020)). Collaborative efforts by numerous linguists have added 67 new languages, including 30 endangered languages. We have implemented several improvements to the extraction pipeline to tackle some issues, e.g. missing gender and macron information. We have also amended the schema to use a hierarchical structure that is needed for morphological phenomena like multiple-argument agreement and case stacking, while adding some missing morphological features to make the schema more inclusive. In light of the last UniMorph release, we also augmented the database with morpheme segmentation for 16 languages. Lastly, this new release makes a push towards inclusion of derivational morphology in UniMorph by enriching the data and annotation schema with instances representing derivational processes from MorphyNet.
△ Less
Submitted 19 June, 2022; v1 submitted 7 May, 2022;
originally announced May 2022.
-
Strategyproof and Proportional Chore Division for Piecewise Uniform Preferences
Authors:
David Francis
Abstract:
Chore division is the problem of fairly dividing some divisible, undesirable bad, such as a set of chores, among a number of players. Each player has their own valuation of the chores, and must be satisfied they did not receive more than their fair share. In this paper, I consider the problem of strategyproof chore division, in which the algorithm must ensure that each player cannot benefit from m…
▽ More
Chore division is the problem of fairly dividing some divisible, undesirable bad, such as a set of chores, among a number of players. Each player has their own valuation of the chores, and must be satisfied they did not receive more than their fair share. In this paper, I consider the problem of strategyproof chore division, in which the algorithm must ensure that each player cannot benefit from mis-representing their position. I present an algorithm that performs proportional and strategyproof chore division for any number of players given piecewise uniform valuation functions.
△ Less
Submitted 1 April, 2022;
originally announced April 2022.
-
Quantifying Cognitive Factors in Lexical Decline
Authors:
David Francis,
Ella Rabinovich,
Farhan Samir,
David Mortensen,
Suzanne Stevenson
Abstract:
We adopt an evolutionary view on language change in which cognitive factors (in addition to social ones) affect the fitness of words and their success in the linguistic ecosystem. Specifically, we propose a variety of psycholinguistic factors -- semantic, distributional, and phonological -- that we hypothesize are predictive of lexical decline, in which words greatly decrease in frequency over tim…
▽ More
We adopt an evolutionary view on language change in which cognitive factors (in addition to social ones) affect the fitness of words and their success in the linguistic ecosystem. Specifically, we propose a variety of psycholinguistic factors -- semantic, distributional, and phonological -- that we hypothesize are predictive of lexical decline, in which words greatly decrease in frequency over time. Using historical data across three languages (English, French, and German), we find that most of our proposed factors show a significant difference in the expected direction between each curated set of declining words and their matched stable words. Moreover, logistic regression analyses show that semantic and distributional factors are significant in predicting declining words. Further diachronic analysis reveals that declining words tend to decrease in the diversity of their lexical contexts over time, gradually narrowing their 'ecological niches'.
△ Less
Submitted 12 October, 2021;
originally announced October 2021.
-
Empirical Evaluation of Kernel PCA Approximation Methods in Classification Tasks
Authors:
Deena P. Francis,
Kumudha Raimond
Abstract:
Kernel Principal Component Analysis (KPCA) is a popular dimensionality reduction technique with a wide range of applications. However, it suffers from the problem of poor scalability. Various approximation methods have been proposed in the past to overcome this problem. The Nyström method, Randomized Nonlinear Component Analysis (RNCA) and Streaming Kernel Principal Component Analysis (SKPCA) were…
▽ More
Kernel Principal Component Analysis (KPCA) is a popular dimensionality reduction technique with a wide range of applications. However, it suffers from the problem of poor scalability. Various approximation methods have been proposed in the past to overcome this problem. The Nyström method, Randomized Nonlinear Component Analysis (RNCA) and Streaming Kernel Principal Component Analysis (SKPCA) were proposed to deal with the scalability issue of KPCA. Despite having theoretical guarantees, their performance in real world learning tasks have not been explored previously. In this work the evaluation of SKPCA, RNCA and Nyström method for the task of classification is done for several real world datasets. The results obtained indicate that SKPCA based features gave much better classification accuracy when compared to the other methods for a very large dataset.
△ Less
Submitted 12 December, 2017;
originally announced December 2017.