Search | arXiv e-print repository

arXiv:2302.08495 [pdf, other]

Parameters, Properties, and Process: Conditional Neural Generation of Realistic SEM Imagery Towards ML-assisted Advanced Manufacturing

Authors: Scott Howland, Lara Kassab, Keerti Kappagantula, Henry Kvinge, Tegan Emerson

Abstract: The research and development cycle of advanced manufacturing processes traditionally requires a large investment of time and resources. Experiments can be expensive and are hence conducted on relatively small scales. This poses problems for typically data-hungry machine learning tools which could otherwise expedite the development cycle. We build upon prior work by applying conditional generative… ▽ More The research and development cycle of advanced manufacturing processes traditionally requires a large investment of time and resources. Experiments can be expensive and are hence conducted on relatively small scales. This poses problems for typically data-hungry machine learning tools which could otherwise expedite the development cycle. We build upon prior work by applying conditional generative adversarial networks (GANs) to scanning electron microscope (SEM) imagery from an emerging manufacturing process, shear assisted processing and extrusion (ShAPE). We generate realistic images conditioned on temper and either experimental parameters or material properties. In doing so, we are able to integrate machine learning into the development cycle, by allowing a user to immediately visualize the microstructure that would arise from particular process parameters or properties. This work forms a technical backbone for a fundamentally new approach for understanding manufacturing processes in the absence of first-principle models. By characterizing microstructure from a topological perspective we are able to evaluate our models' ability to capture the breadth and diversity of experimental scanning electron microscope (SEM) samples. Our method is successful in capturing the visual and general microstructural features arising from the considered process, with analysis highlighting directions to further improve the topological realism of our synthetic imagery. △ Less

Submitted 12 January, 2023; originally announced February 2023.

arXiv:2209.02415 [pdf, other]

Automatic Infectious Disease Classification Analysis with Concept Discovery

Authors: Elena Sizikova, Joshua Vendrow, Xu Cao, Rachel Grotheer, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Thomas Merkh, R. W. M. A. Madushani, Kenny Moise, Annie Ulichney, Huy V. Vo, Chuntian Wang, Megan Coffee, Kathryn Leonard, Deanna Needell

Abstract: Automatic infectious disease classification from images can facilitate needed medical diagnoses. Such an approach can identify diseases, like tuberculosis, which remain under-diagnosed due to resource constraints and also novel and emerging diseases, like monkeypox, which clinicians have little experience or acumen in diagnosing. Avoiding missed or delayed diagnoses would prevent further transmiss… ▽ More Automatic infectious disease classification from images can facilitate needed medical diagnoses. Such an approach can identify diseases, like tuberculosis, which remain under-diagnosed due to resource constraints and also novel and emerging diseases, like monkeypox, which clinicians have little experience or acumen in diagnosing. Avoiding missed or delayed diagnoses would prevent further transmission and improve clinical outcomes. In order to understand and trust neural network predictions, analysis of learned representations is necessary. In this work, we argue that automatic discovery of concepts, i.e., human interpretable attributes, allows for a deep understanding of learned information in medical image analysis tasks, generalizing beyond the training labels or protocols. We provide an overview of existing concept discovery approaches in medical image and computer vision communities, and evaluate representative methods on tuberculosis (TB) prediction and monkeypox prediction tasks. Finally, we propose NMFx, a general NMF formulation of interpretability by concept discovery that works in a unified way in unsupervised, weakly supervised, and supervised scenarios. △ Less

Submitted 14 November, 2022; v1 submitted 28 August, 2022; originally announced September 2022.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 13 pages

arXiv:2204.00629 [pdf, other]

TopTemp: Parsing Precipitate Structure from Temper Topology

Authors: Lara Kassab, Scott Howland, Henry Kvinge, Keerti Sahithi Kappagantula, Tegan Emerson

Abstract: Technological advances are in part enabled by the development of novel manufacturing processes that give rise to new materials or material property improvements. Development and evaluation of new manufacturing methodologies is labor-, time-, and resource-intensive expensive due to complex, poorly defined relationships between advanced manufacturing process parameters and the resulting microstructu… ▽ More Technological advances are in part enabled by the development of novel manufacturing processes that give rise to new materials or material property improvements. Development and evaluation of new manufacturing methodologies is labor-, time-, and resource-intensive expensive due to complex, poorly defined relationships between advanced manufacturing process parameters and the resulting microstructures. In this work, we present a topological representation of temper (heat-treatment) dependent material micro-structure, as captured by scanning electron microscopy, called TopTemp. We show that this topological representation is able to support temper classification of microstructures in a data limited setting, generalizes well to previously unseen samples, is robust to image perturbations, and captures domain interpretable features. The presented work outperforms conventional deep learning baselines and is a first step towards improving understanding of process parameters and resulting material properties. △ Less

Submitted 6 May, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

MSC Class: 55N31 (Primary)

arXiv:2203.03551 [pdf, other]

Semi-supervised Nonnegative Matrix Factorization for Document Classification

Authors: Jamie Haddock, Lara Kassab, Sixian Li, Alona Kryshchenko, Rachel Grotheer, Elena Sizikova, Chuntian Wang, Thomas Merkh, RWMA Madushani, Miju Ahn, Deanna Needell, Kathryn Leonard

Abstract: We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification and provide motivation for these models as maximum likelihood estimators. The proposed SSNMF models simultaneously provide both a topic model and a model for classification, thereby offering highly interpretable classification results. We derive training methods using multiplicative updates f… ▽ More We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification and provide motivation for these models as maximum likelihood estimators. The proposed SSNMF models simultaneously provide both a topic model and a model for classification, thereby offering highly interpretable classification results. We derive training methods using multiplicative updates for each new model, and demonstrate the application of these models to single-label and multi-label document classification, although the models are flexible to other supervised learning tasks such as regression. We illustrate the promise of these models and training methods on document classification datasets (e.g., 20 Newsgroups, Reuters). △ Less

Submitted 28 February, 2022; originally announced March 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2010.07956

arXiv:2010.07956 [pdf, other]

Semi-supervised NMF Models for Topic Modeling in Learning Tasks

Authors: Jamie Haddock, Lara Kassab, Sixian Li, Alona Kryshchenko, Rachel Grotheer, Elena Sizikova, Chuntian Wang, Thomas Merkh, R. W. M. A. Madushani, Miju Ahn, Deanna Needell, Kathryn Leonard

Abstract: We propose several new models for semi-supervised nonnegative matrix factorization (SSNMF) and provide motivation for SSNMF models as maximum likelihood estimators given specific distributions of uncertainty. We present multiplicative updates training methods for each new model, and demonstrate the application of these models to classification, although they are flexible to other supervised learni… ▽ More We propose several new models for semi-supervised nonnegative matrix factorization (SSNMF) and provide motivation for SSNMF models as maximum likelihood estimators given specific distributions of uncertainty. We present multiplicative updates training methods for each new model, and demonstrate the application of these models to classification, although they are flexible to other supervised learning tasks. We illustrate the promise of these models and training methods on both synthetic and real data, and achieve high classification accuracy on the 20 Newsgroups dataset. △ Less

Submitted 15 October, 2020; originally announced October 2020.

Comments: 4 figures, 12 tables

arXiv:2010.01600 [pdf, other]

Sparseness-constrained Nonnegative Tensor Factorization for Detecting Topics at Different Time Scales

Authors: Lara Kassab, Alona Kryshchenko, Hanbaek Lyu, Denali Molitor, Deanna Needell, Elizaveta Rebrova, Jiahong Yuan

Abstract: Temporal data (such as news articles or Twitter feeds) often consists of a mixture of long-lasting trends and popular but short-lasting topics of interest. A truly successful topic modeling strategy should be able to detect both types of topics and clearly locate them in time. In this paper, we first show that nonnegative CANDECOMP/PARAFAC decomposition (NCPD) is able to discover topics of variabl… ▽ More Temporal data (such as news articles or Twitter feeds) often consists of a mixture of long-lasting trends and popular but short-lasting topics of interest. A truly successful topic modeling strategy should be able to detect both types of topics and clearly locate them in time. In this paper, we first show that nonnegative CANDECOMP/PARAFAC decomposition (NCPD) is able to discover topics of variable persistence automatically. Then, we propose sparseness-constrained NCPD (S-NCPD) and its online variant in order to actively control the length of the learned topics effectively and efficiently. Further, we propose quantitative ways to measure the topic length and demonstrate the ability of S-NCPD (as well as its online variant) to discover short and long-lasting temporal topics in a controlled manner in semi-synthetic and real-world data including news headlines. We also demonstrate that the online variant of S-NCPD reduces the reconstruction error more rapidly than S-NCPD. △ Less

Submitted 31 August, 2023; v1 submitted 4 October, 2020; originally announced October 2020.

arXiv:2001.00631 [pdf, other]

On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition

Authors: Miju Ahn, Nicole Eikmeier, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Kathryn Leonard, Deanna Needell, R. W. M. A. Madushani, Elena Sizikova, Chuntian Wang

Abstract: There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employ the method of nonnegative matrix factorization (NMF), where slices of t… ▽ More There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employ the method of nonnegative matrix factorization (NMF), where slices of the data tensor are each factorized into the product of lower-dimensional nonnegative matrices. With this approach, however, information contained in the temporal dimension of the data is often neglected or underutilized. To overcome this issue, we propose instead adopting the method of nonnegative CANDECOMP/PARAPAC (CP) tensor decomposition (NNCPD), where the data tensor is directly decomposed into a minimal sum of outer products of nonnegative vectors, thereby preserving the temporal information. The viability of NNCPD is demonstrated through application to both synthetic and real data, where significantly improved results are obtained compared to those of typical NMF-based methods. The advantages of NNCPD over such approaches are studied and discussed. To the best of our knowledge, this is the first time that NNCPD has been utilized for the purpose of dynamic topic modeling, and our findings will be transformative for both applications and further developments. △ Less

Submitted 14 October, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

Comments: 23 pages, 29 figures, submitted to Women in Data Science and Mathematics (WiSDM) Workshop Proceedings, "Advances in Data Science", AWM-Springer series

arXiv:1812.00875 [pdf, other]

doi 10.1016/j.patrec.2019.11.029

A torus model for optical flow

Authors: Henry Adams, Johnathan Bush, Brittany Carr, Lara Kassab, Joshua Mirth

Abstract: We propose a torus model for high-contrast patches of optical flow. Our model is derived from a database of ground-truth optical flow from the computer-generated video \emph{Sintel}, collected by Butler et al.\ in \emph{A naturalistic open source movie for optical flow evaluation}. Using persistent homology and zigzag persistence, popular tools from the field of computational topology, we show tha… ▽ More We propose a torus model for high-contrast patches of optical flow. Our model is derived from a database of ground-truth optical flow from the computer-generated video \emph{Sintel}, collected by Butler et al.\ in \emph{A naturalistic open source movie for optical flow evaluation}. Using persistent homology and zigzag persistence, popular tools from the field of computational topology, we show that the high-contrast $3\times 3$ patches from this video are well-modeled by a \emph{torus}, a nonlinear 2-dimensional manifold. Furthermore, we show that the optical flow torus model is naturally equipped with the structure of a fiber bundle, related to the statistics of range image patches. △ Less

Submitted 24 November, 2019; v1 submitted 9 November, 2018; originally announced December 2018.

Journal ref: Pattern Recognition Letters 129 (2020), 304-310

Showing 1–8 of 8 results for author: Kassab, L