Search | arXiv e-print repository

Analysis of Spatial and Spatiotemporal Anomalies Using Persistent Homology: Case Studies with COVID-19 Data

Authors: Abigail Hickok, Deanna Needell, Mason A. Porter

Abstract: We develop a method for analyzing spatial and spatiotemporal anomalies in geospatial data using topological data analysis (TDA). To do this, we use persistent homology (PH), which allows one to algorithmically detect geometric voids in a data set and quantify the persistence of such voids. We construct an efficient filtered simplicial complex (FSC) such that the voids in our FSC are in one-to-one correspondence with the anomalies. Our approach goes beyond simply identifying anomalies; it also encodes information about the relationships between anomalies. We use vineyards, which one can interpret as time-varying persistence diagrams (which are an approach for visualizing PH), to track how the locations of the anomalies change with time. We conduct two case studies using spatially heterogeneous COVID-19 data. First, we examine vaccination rates in New York City by zip code at a single point in time. Second, we study a year-long data set of COVID-19 case rates in neighborhoods of the city of Los Angeles.

Submitted 24 February, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

Comments: revised version

MSC Class: 55N31; 68T09; 92D30

arXiv:2105.10598 [pdf, other]

Embracing New Techniques in Deep Learning for Estimating Image Memorability

Authors: Coen D. Needell, Wilma A. Bainbridge

Abstract: Various work has suggested that the memorability of an image is consistent across people, and thus can be treated as an intrinsic property of an image. Using computer vision models, we can make specific predictions about what people will remember or forget. While older work has used now-outdated deep learning architectures to predict image memorability, innovations in the field have given us new techniques to apply to this problem. Here, we propose and evaluate five alternative deep learning models which exploit developments in the field from the last five years, largely the introduction of residual neural networks, which are intended to allow the model to use semantic information in the memorability estimation process. These new models were tested against the prior state of the art with a combined dataset built to optimize both within-category and across-category predictions. Our findings suggest that the key prior memorability network had overstated its generalizability and was overfit on its training set. Our new models outperform this prior model, leading us to conclude that Residual Networks outperform simpler convolutional neural networks in memorability regression. We make our new state-of-the-art model readily available to the research community, allowing memory researchers to make predictions about memorability on a wider range of images.

Submitted 8 January, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

Comments: 27 pages, 15 figures, Presented at the Proceedings of the Vision Sciences Society 2021

ACM Class: J.4; I.2.10

arXiv:2105.09065 [pdf, other]

Statistical Learning for Best Practices in Tattoo Removal

Authors: Richard Yim, Jamie Haddock, Deanna Needell

Abstract: The causes behind complications in laser-assisted tattoo removal are currently not well understood, and in the literature relating to tattoo removal the emphasis on removal treatment is on removal technologies and tools, not best parameters involved in the treatment process. Additionally, the very challenge of determining best practices is difficult given the complexity of interactions between factors that may correlate to these complications. In this paper we apply a battery of classical statistical methods and techniques to identify features that may be closely correlated to causes of complication during the tattoo removal process, and report quantitative evidence for potential best practices. We develop elementary statistical descriptions of tattoo data collected by the largest gang rehabilitation and reentry organization in the world, Homeboy Industries; perform parametric and nonparametric tests of significance; and finally, produce a statistical model explaining treatment parameter interactions, as well as develop a ranking system for treatment parameters utilizing bootstrapping and gradient boosting.

Submitted 19 May, 2021; originally announced May 2021.

Comments: 15 pages, 2 figures, 9 tables

arXiv:2104.14028 [pdf, other]

Analysis of Legal Documents via Non-negative Matrix Factorization Methods

Authors: Ryan Budahazy, Lu Cheng, Yihuan Huang, Andrew Johnson, Pengyu Li, Joshua Vendrow, Zhoutong Wu, Denali Molitor, Elizaveta Rebrova, Deanna Needell

Abstract: The California Innocence Project (CIP), a clinical law school program aiming to free wrongfully convicted prisoners, evaluates thousands of mails containing new requests for assistance and corresponding case files. Processing and interpreting this large amount of information presents a significant challenge for CIP officials, which can be successfully aided by topic modeling techniques. In this paper, we apply the Non-negative Matrix Factorization (NMF) method and implement various offshoots of it to the important and previously unstudied data set compiled by CIP. We identify underlying topics of existing case files and classify request files by crime type and case status (decision type). The results uncover the semantic structure of current case files and can provide CIP officials with a general understanding of newly received case files before further examinations. We also provide an exposition of popular variants of NMF with their experimental results and discuss the benefits and drawbacks of each variant through the real-world application.

Submitted 6 November, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

Comments: 16 pages, 4 figures

arXiv:2103.11037 [pdf, other]

Mode-wise Tensor Decompositions: Multi-dimensional Generalizations of CUR Decompositions

Authors: HanQin Cai, Keaton Hamm, Longxiu Huang, Deanna Needell

Abstract: Low rank tensor approximation is a fundamental tool in modern machine learning and data science. In this paper, we study the characterization, perturbation analysis, and an efficient sampling strategy for two primary tensor CUR approximations, namely Chidori and Fiber CUR. We characterize exact tensor CUR decompositions for low multilinear rank tensors. We also present theoretical error bounds of the tensor CUR approximations when (adversarial or Gaussian) noise appears. Moreover, we show that low cost uniform sampling is sufficient for tensor CUR approximations if the tensor has an incoherent structure. Empirical performance evaluations, with both synthetic and real-world datasets, establish the speed advantage of the tensor CUR approximations over other state-of-the-art low multilinear rank tensor approximations.

Submitted 25 June, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

Journal ref: The Journal of Machine Learning Research 22.185 (2021): 1-36

arXiv:2101.05231 [pdf, other]

doi 10.1137/20M1388322

Robust CUR Decomposition: Theory and Imaging Applications

Authors: HanQin Cai, Keaton Hamm, Longxiu Huang, Deanna Needell

Abstract: This paper considers the use of Robust PCA in a CUR decomposition framework and applications thereof. Our main algorithms produce a robust version of column-row factorizations of matrices $\mathbf{D}=\mathbf{L}+\mathbf{S}$ where $\mathbf{L}$ is low-rank and $\mathbf{S}$ contains sparse outliers. These methods yield interpretable factorizations at low computational cost, and provide new CUR decompositions that are robust to sparse outliers, in contrast to previous methods. We consider two key imaging applications of Robust PCA: video foreground-background separation and face modeling. This paper examines the qualitative behavior of our Robust CUR decompositions on the benchmark videos and face datasets, and finds that our method works as well as standard Robust PCA while being significantly faster. Additionally, we consider hybrid randomized and deterministic sampling methods which produce a compact CUR decomposition of a given matrix, and apply this to video sequences to produce canonical frames thereof.

Submitted 5 August, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

MSC Class: 15A23; 65F30; 68P20; 68W20; 68W25; 68Q25

Journal ref: SIAM Journal on Imaging Sciences 14.4 (2021): 1472-1503

arXiv:2011.05384 [pdf, other]

Applications of Online Nonnegative Matrix Factorization to Image and Time-Series Data

Authors: Hanbaek Lyu, Georg Menz, Deanna Needell, Christopher Strohmeier

Abstract: Online nonnegative matrix factorization (ONMF) is a matrix factorization technique in the online setting where data are acquired in a streaming fashion and the matrix factors are updated each time. This enables factor analysis to be performed concurrently with the arrival of new data samples. In this article, we demonstrate how one can use online nonnegative matrix factorization algorithms to learn joint dictionary atoms from an ensemble of correlated data sets. We propose a temporal dictionary learning scheme for time-series data sets, based on ONMF algorithms. We demonstrate our dictionary learning technique in the application contexts of historical temperature data, video frames, and color images.

Submitted 10 November, 2020; originally announced November 2020.

Comments: 9 pages, 8 figures

Journal ref: 2020 Information Theory and Applications Workshop (ITA)

arXiv:2010.11365 [pdf, other]

On a Guided Nonnegative Matrix Factorization

Authors: Joshua Vendrow, Jamie Haddock, Elizaveta Rebrova, Deanna Needell

Abstract: Fully unsupervised topic models have found fantastic success in document clustering and classification. However, these models often suffer from the tendency to learn less-than-meaningful or even redundant topics when the data is biased towards a set of features. For this reason, we propose an approach based upon the nonnegative matrix factorization (NMF) model, deemed \textit{Guided NMF}, that incorporates user-designed seed word supervision. Our experimental results demonstrate the promise of this model and illustrate that it is competitive with other methods of this ilk with only very little supervision information.

Submitted 5 February, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: 6 pages, 6 tables

arXiv:2010.07956 [pdf, other]

Semi-supervised NMF Models for Topic Modeling in Learning Tasks

Authors: Jamie Haddock, Lara Kassab, Sixian Li, Alona Kryshchenko, Rachel Grotheer, Elena Sizikova, Chuntian Wang, Thomas Merkh, R. W. M. A. Madushani, Miju Ahn, Deanna Needell, Kathryn Leonard

Abstract: We propose several new models for semi-supervised nonnegative matrix factorization (SSNMF) and provide motivation for SSNMF models as maximum likelihood estimators given specific distributions of uncertainty. We present multiplicative updates training methods for each new model, and demonstrate the application of these models to classification, although they are flexible to other supervised learning tasks. We illustrate the promise of these models and training methods on both synthetic and real data, and achieve high classification accuracy on the 20 Newsgroups dataset.

Submitted 15 October, 2020; originally announced October 2020.

Comments: 4 figures, 12 tables
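
For readers unfamiliar with the "multiplicative updates" mentioned in the abstract above, the following is a minimal sketch of the classical Lee-Seung updates for plain (unsupervised) NMF under the Frobenius loss. It is not the SSNMF training procedure from the paper; the function names (`nmf`, `matmul`, `frob_error`), the toy matrix `X`, and the small constant `eps` are assumptions made for illustration.

```python
import random

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def frob_error(X, W, H):
    """Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5

def nmf(X, rank, num_iters=200, eps=1e-12, seed=0):
    """Classical multiplicative updates for X ~ W H with W, H >= 0 (illustrative only)."""
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    # Strictly positive random initialization (an assumption of this sketch).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    initial_error = frob_error(X, W, H)
    for _ in range(num_iters):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        numer, denom = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, numer, denom)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        numer, denom = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, numer, denom)]
    return W, H, initial_error

# Toy nonnegative matrix with an exact rank-2 nonnegative factorization.
X = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [1.0, 3.0, 3.0], [2.0, 5.0, 3.0]]
W, H, initial_error = nmf(X, rank=2)
final_error = frob_error(X, W, H)
print(f"reconstruction error: {initial_error:.3f} -> {final_error:.3f}")
```

Because each update multiplies an entry by a nonnegative ratio, the factors stay nonnegative without any explicit projection step.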

arXiv:2010.01600 [pdf, other]

Sparseness-constrained Nonnegative Tensor Factorization for Detecting Topics at Different Time Scales

Authors: Lara Kassab, Alona Kryshchenko, Hanbaek Lyu, Denali Molitor, Deanna Needell, Elizaveta Rebrova, Jiahong Yuan

Abstract: Temporal data (such as news articles or Twitter feeds) often consists of a mixture of long-lasting trends and popular but short-lasting topics of interest. A truly successful topic modeling strategy should be able to detect both types of topics and clearly locate them in time. In this paper, we first show that nonnegative CANDECOMP/PARAFAC decomposition (NCPD) is able to discover topics of variable persistence automatically. Then, we propose sparseness-constrained NCPD (S-NCPD) and its online variant in order to actively control the length of the learned topics effectively and efficiently. Further, we propose quantitative ways to measure the topic length and demonstrate the ability of S-NCPD (as well as its online variant) to discover short and long-lasting temporal topics in a controlled manner in semi-synthetic and real-world data including news headlines. We also demonstrate that the online variant of S-NCPD reduces the reconstruction error more rapidly than S-NCPD.

Submitted 31 August, 2023; v1 submitted 4 October, 2020; originally announced October 2020.

arXiv:2009.09087 [pdf, other]

Feature Selection on Lyme Disease Patient Survey Data

Authors: Joshua Vendrow, Jamie Haddock, Deanna Needell, Lorraine Johnson

Abstract: Lyme disease is a rapidly growing illness that remains poorly understood within the medical community. Critical questions about when and why patients respond to treatment or stay ill, what kinds of treatments are effective, and even how to properly diagnose the disease remain largely unanswered. We investigate these questions by applying machine learning techniques to a large scale Lyme disease patient registry, MyLymeData, developed by the nonprofit LymeDisease.org. We apply various machine learning methods in order to measure the effect of individual features in predicting participants' answers to the Global Rating of Change (GROC) survey questions that assess the self-reported degree to which their condition improved, worsened, or remained unchanged following antibiotic treatment. We use basic linear regression, support vector machines, neural networks, entropy-based decision tree models, and $k$-nearest neighbors approaches. We first analyze the general performance of the model and then identify the most important features for predicting participant answers to GROC. After we identify the "key" features, we separate them from the dataset and demonstrate the effectiveness of these features at identifying GROC. In doing so, we highlight possible directions for future study both mathematically and clinically.

Submitted 24 August, 2020; originally announced September 2020.

Comments: 9 pages, 8 figures, 6 tables

arXiv:2009.09074 [pdf, other]

COVID-19 Literature Topic-Based Search via Hierarchical NMF

Authors: Rachel Grotheer, Yihuan Huang, Pengyu Li, Elizaveta Rebrova, Deanna Needell, Longxiu Huang, Alona Kryshchenko, Xia Li, Kyung Ha, Oleksandr Kryshchenko

Abstract: A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease and patient studies, as well as related diseases and virology. In order that our tool may help current researchers, an interactive website is created that organizes available literature using this hierarchical structure.

Submitted 7 September, 2020; originally announced September 2020.

arXiv:2009.08089 [pdf, other]

Quantile-based Iterative Methods for Corrupted Systems of Linear Equations

Authors: Jamie Haddock, Deanna Needell, Elizaveta Rebrova, William Swartworth

Abstract: Often in applications ranging from medical imaging and sensor networks to error correction and data science (and beyond), one needs to solve large-scale linear systems in which a fraction of the measurements have been corrupted. We consider solving such large-scale systems of linear equations $\mathbf{A}\mathbf{x}=\mathbf{b}$ that are inconsistent due to corruptions in the measurement vector $\mathbf{b}$. We develop several variants of iterative methods that converge to the solution of the uncorrupted system of equations, even in the presence of large corruptions. These methods make use of a quantile of the absolute values of the residual vector in determining the iterate update. We present both theoretical and empirical results that demonstrate the promise of these iterative approaches.

Submitted 7 July, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

MSC Class: 65F10; 68W20; 60B20
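
The idea in the abstract above, using a quantile of the absolute residuals to decide whether to apply an update, can be illustrated with a small Kaczmarz-style sketch. This is one plausible reading of that description, not the authors' algorithm: the function name `quantile_rk`, the choice `q=0.7`, unit-norm rows, computing the quantile over all rows at every step, and the toy data are all assumptions made for illustration.

```python
import math
import random

def quantile_rk(A, b, q=0.7, num_iters=2000, seed=0):
    """Kaczmarz-style iteration that skips rows whose residual exceeds the q-quantile.

    Assumes every row of A has unit Euclidean norm (illustrative sketch only).
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(num_iters):
        # Residuals <a_j, x> - b_j for every row.
        residuals = [sum(a * xj for a, xj in zip(row, x)) - bj for row, bj in zip(A, b)]
        # q-quantile of the absolute residuals, used as an acceptance threshold.
        threshold = sorted(abs(r) for r in residuals)[max(0, math.ceil(q * m) - 1)]
        i = rng.randrange(m)
        if abs(residuals[i]) <= threshold:
            # Project onto the hyperplane of row i (row has unit norm).
            x = [xj - residuals[i] * a for xj, a in zip(x, A[i])]
    return x

# Toy system: 60 unit-norm rows in 3 dimensions, 3 measurements heavily corrupted.
data_rng = random.Random(1)
x_true = [1.0, -2.0, 0.5]
A = []
for _ in range(60):
    row = [data_rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(v * v for v in row))
    A.append([v / norm for v in row])
b = [sum(a * xt for a, xt in zip(row, x_true)) for row in A]
for i in (5, 17, 42):
    b[i] += 50.0  # large corruptions in the measurement vector

x_hat = quantile_rk(A, b)
error = max(abs(u - v) for u, v in zip(x_hat, x_true))
print("max abs error:", error)
```

In this toy setting the corrupted rows always have the largest residuals, so the threshold rejects them and the iterates approach the solution of the uncorrupted system.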

arXiv:2009.07612 [pdf, other]

Online nonnegative CP-dictionary learning for Markovian data

Authors: Hanbaek Lyu, Christopher Strohmeier, Deanna Needell

Abstract: Online Tensor Factorization (OTF) is a fundamental tool in learning low-dimensional interpretable features from streaming multi-modal data. While various algorithmic and theoretical aspects of OTF have been investigated recently, a general convergence guarantee to stationary points of the objective function without any incoherence or sparsity assumptions is still lacking even for the i.i.d. case. In this work, we introduce a novel algorithm that learns a CANDECOMP/PARAFAC (CP) basis from a given stream of tensor-valued data under general constraints, including nonnegativity constraints that induce interpretability of the learned CP basis. We prove that our algorithm converges almost surely to the set of stationary points of the objective function under the hypothesis that the sequence of data tensors is generated by an underlying Markov chain. Our setting covers the classical i.i.d. case as well as a wide range of application contexts including data streams generated by independent or MCMC sampling. Our result closes a gap between OTF and Online Matrix Factorization in global convergence analysis for CP-decompositions. Experimentally, we show that our algorithm converges much faster than standard algorithms for nonnegative tensor factorization tasks on both synthetic and real-world data. Also, we demonstrate the utility of our algorithm on a diverse set of examples from image, video, and time-series data, illustrating how one may learn qualitatively different CP-dictionaries from the same tensor data by exploiting the tensor structure in multiple ways.

Submitted 2 April, 2022; v1 submitted 16 September, 2020; originally announced September 2020.

Comments: 41 pages, 5 figures

arXiv:2009.01279 [pdf, other]

Clustering of Nonnegative Data and an Application to Matrix Completion

Authors: C. Strohmeier, D. Needell

Abstract: In this paper, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matrix completion algorithm which can outperform standard matrix completion algorithms on data matrices satisfying certain natural conditions.

Submitted 2 September, 2020; originally announced September 2020.

arXiv:2007.15776 [pdf, other]

Random Vector Functional Link Networks for Function Approximation on Manifolds

Authors: Deanna Needell, Aaron A. Nelson, Rayan Saab, Palina Salanevich, Olov Schavemaker

Abstract: The learning speed of feed-forward neural networks is notoriously slow and has presented a bottleneck in deep learning applications for several decades. For instance, gradient-based learning algorithms, which are used extensively to train neural networks, tend to work slowly when all of the network parameters must be iteratively tuned. To counter this, both researchers and practitioners have tried introducing randomness to reduce the learning requirement. Based on the original construction of Igelnik and Pao, single layer neural-networks with random input-to-hidden layer weights and biases have seen success in practice, but the necessary theoretical justification is lacking. In this paper, we begin to fill this theoretical gap. We provide a (corrected) rigorous proof that the Igelnik and Pao construction is a universal approximator for continuous functions on compact domains, with approximation error decaying asymptotically like $O(1/\sqrt{n})$ for the number $n$ of network nodes. We then extend this result to the non-asymptotic setting, proving that one can achieve any desired approximation error with high probability provided $n$ is sufficiently large. We further adapt this randomized neural network architecture to approximate functions on smooth, compact submanifolds of Euclidean space, providing theoretical guarantees in both the asymptotic and non-asymptotic forms. Finally, we illustrate our results on manifolds with numerical experiments.

Submitted 28 March, 2024; v1 submitted 30 July, 2020; originally announced July 2020.

Comments: 37 pages, 1 figure

MSC Class: 62M45

arXiv:2004.09112 [pdf, other]

COVID-19 Time-series Prediction by Joint Dictionary Learning and Online NMF

Authors: Hanbaek Lyu, Christopher Strohmeier, Georg Menz, Deanna Needell

Abstract: Predicting the spread and containment of COVID-19 is a challenge of utmost importance that the broader scientific community is currently facing. One of the main sources of difficulty is that a very limited amount of daily COVID-19 case data is available, and with few exceptions, the majority of countries are currently in the "exponential spread stage," and thus there is scarce information available which would enable one to predict the phase transition between spread and containment. In this paper, we propose a novel approach to predicting the spread of COVID-19 based on dictionary learning and online nonnegative matrix factorization (online NMF). The key idea is to learn dictionary patterns of short evolution instances of the new daily cases in multiple countries at the same time, so that their latent correlation structures are captured in the dictionary patterns. We first learn such patterns by minibatch learning from the entire time-series and then further adapt them to the time-series by online NMF. As we progressively adapt and improve the learned dictionary patterns to the more recent observations, we also use them to make one-step predictions by the partial fitting. Lastly, by recursively applying the one-step predictions, we can extrapolate our predictions into the near future. Our prediction results can be directly attributed to the learned dictionary patterns due to their interpretability.

Submitted 20 April, 2020; originally announced April 2020.

Comments: 8 pages, 4 figures

arXiv:2003.09062 [pdf, other]

Tensor Completion through Total Variation with Initialization from Weighted HOSVD

Authors: Zehan Chao, Longxiu Huang, Deanna Needell

Abstract: In our paper, we have studied the tensor completion problem when the sampling pattern is deterministic. We first propose a simple but efficient weighted HOSVD algorithm for recovery from noisy observations. Then we use the weighted HOSVD result as an initialization for the total variation. We have proved the accuracy of the weighted HOSVD algorithm from theoretical and numerical perspectives. In the numerical simulation parts, we also showed that by using the proposed initialization, the total variation algorithm can efficiently fill the missing data for images and videos.

Submitted 19 March, 2020; originally announced March 2020.

Comments: 8 pages, 6 figures, ITA 2020

arXiv:2003.08537 [pdf, other]

HOSVD-Based Algorithm for Weighted Tensor Completion

Authors: Zehan Chao, Longxiu Huang, Deanna Needell

Abstract: Matrix completion, the problem of completing missing entries in a data matrix with low dimensional structure (such as rank), has seen many fruitful approaches and analyses. Tensor completion is the tensor analog, that attempts to impute missing tensor entries from similar low-rank type assumptions. In this paper, we study the tensor completion problem when the sampling pattern is deterministic and possibly non-uniform. We first propose an efficient weighted HOSVD algorithm for recovery of the underlying low-rank tensor from noisy observations and then derive the error bounds under a properly weighted metric. Additionally, the efficiency and accuracy of our algorithm are both tested using synthetic and real datasets in numerical simulations.

Submitted 6 July, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

MSC Class: 15A69; 15A83; 65F30; 68P99; 68W20; 65F99

Journal ref: Journal of Imaging, 2021

arXiv:2002.04126 [pdf, other]

Randomized Kaczmarz with Averaging

Authors: Jacob D. Moorman, Thomas K. Tu, Denali Molitor, Deanna Needell

Abstract: The randomized Kaczmarz (RK) method is an iterative method for approximating the least-squares solution of large linear systems of equations. The standard RK method uses sequential updates, making parallel computation difficult. Here, we study a parallel version of RK where a weighted average of independent updates is used. We analyze the convergence of RK with averaging and demonstrate its performance empirically. We show that as the number of threads increases, the rate of convergence improves and the convergence horizon for inconsistent systems decreases.

Submitted 10 February, 2020; originally announced February 2020.

Comments: 19 pages, 9 figures

MSC Class: 15A06; 15B52; 65F10; 65F20; 65Y20; 68Q25; 68W10; 68W20; 68W40
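
A minimal sketch of the idea in the abstract above: several independent Kaczmarz row updates are computed from the same iterate and then averaged. The function name `rk_averaging`, the uniform averaging weights, the relaxation parameter `alpha`, and sampling rows with probability proportional to their squared norms are all assumptions made for illustration; the paper studies a general weighted average and may differ in these details. The simulated "threads" here run sequentially.

```python
import random

def rk_averaging(A, b, num_threads=4, iters=2000, alpha=1.0, seed=0):
    """Randomized Kaczmarz that averages num_threads independent row updates per step.

    Hypothetical illustration: uniform weights and row-norm sampling are assumptions.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_norms_sq = [sum(v * v for v in row) for row in A]
    x = [0.0] * n
    for _ in range(iters):
        # Draw rows independently, with probability proportional to squared row norm.
        rows = rng.choices(range(m), weights=row_norms_sq, k=num_threads)
        step = [0.0] * n
        for i in rows:
            # Each "thread" computes the projection step onto its hyperplane a_i . x = b_i.
            residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
            scale = residual / row_norms_sq[i]
            for j in range(n):
                step[j] += scale * A[i][j]
        # Average the steps (uniform weights) and apply the relaxation parameter.
        x = [xj + alpha * sj / num_threads for xj, sj in zip(x, step)]
    return x

# Small consistent system whose exact solution is [1, 2].
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]
x = rk_averaging(A, b)
print(x)
```

With `num_threads=1` and `alpha=1.0` this reduces to the standard sequential RK update.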

arXiv:2002.02041 [pdf, other]

An Adaptation for Iterative Structured Matrix Completion

Authors: Henry Adams, Lara Kassab, Deanna Needell

Abstract: The task of predicting missing entries of a matrix, from a subset of known entries, is known as \textit{matrix completion}. In today's data-driven world, data completion is essential whether it is the main goal or a pre-processing step. Structured matrix completion includes any setting in which data is not missing uniformly at random. In recent work, a modification to the standard nuclear norm minimization (NNM) for matrix completion has been developed to take into account \emph{sparsity-based} structure in the missing entries. This notion of structure is motivated in many settings including recommender systems, where the probability that an entry is observed depends on the value of the entry. We propose adjusting an Iteratively Reweighted Least Squares (IRLS) algorithm for low-rank matrix completion to take into account sparsity-based structure in the missing entries. We also present an iterative gradient-projection-based implementation of the algorithm that can handle large-scale matrices. Finally, we present a robust array of numerical experiments on matrices of varying sizes, ranks, and level of structure. We show that our proposed method is comparable with the adjusted NNM on small-sized matrices, and often outperforms the IRLS algorithm in structured settings on matrices up to size $1000 \times 1000$.

Submitted 14 May, 2021; v1 submitted 5 February, 2020; originally announced February 2020.

MSC Class: 15A83; 65F55 (Primary); 65F50 (Secondary)

arXiv:2001.00631 [pdf, other]

On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition

Authors: Miju Ahn, Nicole Eikmeier, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Kathryn Leonard, Deanna Needell, R. W. M. A. Madushani, Elena Sizikova, Chuntian Wang

Abstract: There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employs the method of nonnegative matrix factorization (NMF), where slices of the data tensor are each factorized into the product of lower-dimensional nonnegative matrices. With this approach, however, information contained in the temporal dimension of the data is often neglected or underutilized. To overcome this issue, we propose instead adopting the method of nonnegative CANDECOMP/PARAFAC (CP) tensor decomposition (NNCPD), where the data tensor is directly decomposed into a minimal sum of outer products of nonnegative vectors, thereby preserving the temporal information. The viability of NNCPD is demonstrated through application to both synthetic and real data, where significantly improved results are obtained compared to those of typical NMF-based methods. The advantages of NNCPD over such approaches are studied and discussed. To the best of our knowledge, this is the first time that NNCPD has been utilized for the purpose of dynamic topic modeling, and our findings will be transformative for both applications and further developments.

Submitted 14 October, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

Comments: 23 pages, 29 figures, submitted to Women in Data Science and Mathematics (WiSDM) Workshop Proceedings, "Advances in Data Science", AWM-Springer series

arXiv:1912.08294 [pdf, other]

Lower Memory Oblivious (Tensor) Subspace Embeddings with Fewer Random Bits: Modewise Methods for Least Squares

Authors: M. A. Iwen, D. Needell, E. Rebrova, A. Zare

Abstract: In this paper new general modewise Johnson-Lindenstrauss (JL) subspace embeddings are proposed that are both considerably faster to generate and easier to store than traditional JL embeddings when working with extremely large vectors and/or tensors. Corresponding embedding results are then proven for two different types of low-dimensional (tensor) subspaces. The first of these new subspace embedding results produces improved space complexity bounds for embeddings of rank-$r$ tensors whose CP decompositions are contained in the span of a fixed (but unknown) set of $r$ rank-one basis tensors. In the traditional vector setting this first result yields new and very general near-optimal oblivious subspace embedding constructions that require fewer random bits to generate than standard JL embeddings when embedding subspaces of $\mathbb{C}^N$ spanned by basis vectors with special Kronecker structure. The second result proven herein provides new fast JL embeddings of arbitrary $r$-dimensional subspaces $\mathcal{S} \subset \mathbb{C}^N$ which also require fewer random bits (and so are easier to store - i.e., require less space) than standard fast JL embedding methods in order to achieve small $ε$-distortions. These new oblivious subspace embedding results work by $(i)$ effectively folding any given vector in $\mathcal{S}$ into a (not necessarily low-rank) tensor, and then $(ii)$ embedding the resulting tensor into $\mathbb{C}^m$ for $m \leq C r \log^c(N) / ε^2$.
Applications related to compression and fast compressed least squares solution methods are also considered, including those used for fitting low-rank CP decompositions, and the proposed JL embedding results are shown to work well numerically in both settings.

Submitted 16 December, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

arXiv:1912.00771 [pdf, other]

Sketching for Motzkin's Iterative Method for Linear Systems

Authors: Elizaveta Rebrova, Deanna Needell

Abstract: Projection-based iterative methods for solving large over-determined linear systems are well-known for their simplicity and computational efficiency. It is also known that the correct choice of a sketching procedure (i.e., preprocessing steps that reduce the dimension of each iteration) can improve the performance of iterative methods in multiple ways, such as, to speed up the convergence of the method by fighting inner correlations of the system, or to reduce the variance incurred by the presence of noise. In the current work, we show that sketching can also help us to get better theoretical guarantees for the projection-based methods. Specifically, we use good properties of Gaussian sketching to prove an accelerated convergence rate of the sketched relaxation (also known as Motzkin's) method. The new estimates hold for linear systems of arbitrary structure. We also provide numerical experiments in support of our theoretical analysis of the sketched relaxation method.

Submitted 28 November, 2019; originally announced December 2019.

arXiv:1912.00315 [pdf, other]

Topic-aware chatbot using Recurrent Neural Networks and Nonnegative Matrix Factorization

Authors: Yuchen Guo, Nicholas Hanoian, Zhexiao Lin, Nicholas Liskij, Hanbaek Lyu, Deanna Needell, Jiahao Qu, Henry Sojico, Yuliang Wang, Zhe Xiong, Zhenhong Zou

Abstract: We propose a novel model for a topic-aware chatbot by combining the traditional Recurrent Neural Network (RNN) encoder-decoder model with a topic attention layer based on Nonnegative Matrix Factorization (NMF). After learning topic vectors from an auxiliary text corpus via NMF, the decoder is trained so that it is more likely to sample response words from the most correlated topic vectors. One of the main advantages in our architecture is that the user can easily switch the NMF-learned topic vectors so that the chatbot obtains desired topic-awareness. We demonstrate our model by training on a single conversational data set which is then augmented with topic matrices learned from different auxiliary data sets. We show that our topic-aware chatbot not only outperforms the non-topic counterpart, but also that each topic-aware model qualitatively and contextually gives the most relevant answer depending on the topic of question.

Submitted 4 December, 2019; v1 submitted 30 November, 2019; originally announced December 2019.

Comments: 14 pages, 1 figure, 2 tables

arXiv:1911.01931 [pdf, other]

Online matrix factorization for Markovian data and applications to Network Dictionary Learning

Authors: Hanbaek Lyu, Deanna Needell, Laura Balzano

Abstract: Online Matrix Factorization (OMF) is a fundamental tool for dictionary learning problems, giving an approximate representation of complex data sets in terms of a reduced number of extracted features. Convergence guarantees for most of the OMF algorithms in the literature assume independence between data matrices, and the case of dependent data streams remains largely unexplored. In this paper, we show that a non-convex generalization of the well-known OMF algorithm for i.i.d. stream of data in \citep{mairal2010online} converges almost surely to the set of critical points of the expected loss function, even when the data matrices are functions of some underlying Markov chain satisfying a mild mixing condition. This allows one to extract features more efficiently from dependent data streams, as there is no need to subsample the data sequence to approximately satisfy the independence assumption. As the main application, by combining online non-negative matrix factorization and a recent MCMC algorithm for sampling motifs from networks, we propose a novel framework of Network Dictionary Learning, which extracts ``network dictionary patches'' from a given network in an online manner that encodes main features of the network. We demonstrate this technique and its application to network denoising problems on real-world network data.

Submitted 7 November, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: 39 pages, 13 figures

Journal ref: Journal of Machine Learning Research 21 (2020)

arXiv:1910.13986 [pdf, other]

Weighted matrix completion from non-random, non-uniform sampling patterns

Authors: Simon Foucart, Deanna Needell, Reese Pathak, Yaniv Plan, Mary Wootters

Abstract: We study the matrix completion problem when the observation pattern is deterministic and possibly non-uniform. We propose a simple and efficient debiased projection scheme for recovery from noisy observations and analyze the error under a suitable weighted metric. We introduce a simple function of the weight matrix and the sampling pattern that governs the accuracy of the recovered matrix. We derive theoretical guarantees that upper bound the recovery error and nearly matching lower bounds that showcase optimality in several regimes. Our numerical experiments demonstrate the computational efficiency and accuracy of our approach, and show that debiasing is essential when using non-uniform sampling patterns.

Submitted 30 October, 2019; originally announced October 2019.

Comments: 41 pages, 4 figures

arXiv:1909.10132 [pdf, other]

Stochastic Iterative Hard Thresholding for Low-Tucker-Rank Tensor Recovery

Authors: Rachel Grotheer, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

Abstract: Low-rank tensor recovery problems have been widely studied in many applications of signal processing and machine learning. Tucker decomposition is known as one of the most popular decompositions in the tensor framework. In recent years, researchers have developed many state-of-the-art algorithms to address the problem of low-Tucker-rank tensor recovery. Motivated by the favorable properties of the stochastic algorithms, such as stochastic gradient descent and stochastic iterative hard thresholding, we aim to extend the well-known stochastic iterative hard thresholding algorithm to the tensor framework in order to address the problem of recovering a low-Tucker-rank tensor from its linear measurements. We have also developed linear convergence analysis for the proposed method and conducted a series of experiments with both synthetic and real data to illustrate the performance of the proposed method.

Submitted 16 July, 2020; v1 submitted 22 September, 2019; originally announced September 2019.

arXiv:1909.03604 [pdf, other]

Adaptive Sketch-and-Project Methods for Solving Linear Systems

Authors: Robert Gower, Denali Molitor, Jacob Moorman, Deanna Needell

Abstract: We present new adaptive sampling rules for the sketch-and-project method for solving linear systems. To deduce our new sampling rules, we first show how the progress of one step of the sketch-and-project method depends directly on a sketched residual. Based on this insight, we derive a 1) max-distance sampling rule, by sampling the sketch with the largest sketched residual 2) a proportional sampling rule, by sampling proportional to the sketched residual, and finally 3) a capped sampling rule. The capped sampling rule is a generalization of the recently introduced adaptive sampling rules for the Kaczmarz method. We provide a global linear convergence theorem for each sampling rule and show that the max-distance rule enjoys the fastest convergence. This finding is also verified in extensive numerical experiments that lead us to conclude that the max-distance sampling rule is superior both experimentally and theoretically to the capped sampling rule. We also provide numerical insights into implementing the adaptive strategies so that the per iteration cost is of the same order as using a fixed sampling strategy when the number of sketches times the sketch size is not significantly larger than the number of columns.

Submitted 8 September, 2019; originally announced September 2019.

MSC Class: 15A06; 15B52; 65F10; 68W20; 65N75; 65Y20; 68Q25; 68W40; 90C20
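
As a rough illustration of the max-distance rule described in the abstract above, the sketch below specializes it to the Kaczmarz setting, where each "sketch" is a single row of the system and the sketched residual is the row residual scaled by the row norm. The function name `max_distance_kaczmarz` and this single-row specialization are assumptions for illustration only; the paper treats general sketch-and-project methods.

```python
def max_distance_kaczmarz(A, b, iters=200):
    """Kaczmarz iteration that always projects onto the farthest hyperplane.

    Hypothetical illustration of a max-distance sampling rule with single-row sketches.
    """
    m, n = len(A), len(A[0])
    row_norms_sq = [sum(v * v for v in row) for row in A]
    x = [0.0] * n
    for _ in range(iters):
        # Residual of every equation at the current iterate.
        residuals = [sum(a * xj for a, xj in zip(A[i], x)) - b[i] for i in range(m)]
        # Max-distance rule: pick the row whose hyperplane is farthest from x.
        i = max(range(m), key=lambda k: residuals[k] ** 2 / row_norms_sq[k])
        # Orthogonal projection of x onto the hyperplane a_i . x = b_i.
        scale = residuals[i] / row_norms_sq[i]
        x = [xj - scale * a for xj, a in zip(x, A[i])]
    return x

# Small consistent system whose exact solution is [1, 2].
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]
print(max_distance_kaczmarz(A, b))
```

Computing every residual at each step costs a full matrix-vector product, which is the per-iteration overhead the abstract alludes to when it discusses implementation cost.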

arXiv:1908.08479 [pdf, other]

Iterative Hard Thresholding for Low CP-rank Tensor Models

Authors: Rachel Grotheer, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

Abstract: Recovery of low-rank matrices from a small number of linear measurements is now well-known to be possible under various model assumptions on the measurements. Such results demonstrate robustness and are backed with provable theoretical guarantees. However, extensions to tensor recovery have only recently begun to be studied and developed, despite an abundance of practical tensor applications. Recently, a tensor variant of the Iterative Hard Thresholding method was proposed and theoretical results were obtained that guarantee exact recovery of tensors with low Tucker rank. In this paper, we utilize the same tensor version of the Restricted Isometry Property (RIP) to extend these results for tensors with low CANDECOMP/PARAFAC (CP) rank. In doing so, we leverage recent results on efficient approximations of CP decompositions that remove the need for challenging assumptions in prior works. We complement our theoretical findings with empirical results that showcase the potential of the approach.

Submitted 22 August, 2019; originally announced August 2019.

arXiv:1907.11746 [pdf, other]

Bias of Homotopic Gradient Descent for the Hinge Loss

Authors: Denali Molitor, Deanna Needell, Rachel Ward

Abstract: Gradient descent is a simple and widely used optimization method for machine learning. For homogeneous linear classifiers applied to separable data, gradient descent has been shown to converge to the maximal margin (or equivalently, the minimal norm) solution for various smooth loss functions. The previous theory does not, however, apply to non-smooth functions such as the hinge loss which is widely used in practice. Here, we study the convergence of a homotopic variant of gradient descent applied to the hinge loss and provide explicit convergence rates to the max-margin solution for linearly separable data.

Submitted 26 July, 2019; originally announced July 2019.

arXiv:1907.03028 [pdf, other]

On Inferences from Completed Data

Authors: Jamie Haddock, Denali Molitor, Deanna Needell, Sneha Sambandam, Joy Song, Simon Sun

Abstract: Matrix completion has become an extremely important technique as data scientists are routinely faced with large, incomplete datasets on which they wish to perform statistical inferences. We investigate how error introduced via matrix completion affects statistical inference. Furthermore, we prove recovery error bounds which depend upon the matrix recovery error for several common statistical inferences. We consider matrix recovery via nuclear norm minimization and a variant, $\ell_1$-regularized nuclear norm minimization for data with a structured sampling pattern. Finally, we run a series of numerical experiments on synthetic data and real patient surveys from MyLymeData, which illustrate the relationship between inference recovery error and matrix recovery error. These results indicate that exact matrix recovery is often not necessary to achieve small inference recovery error.

Submitted 5 July, 2019; originally announced July 2019.

arXiv:1905.13404 [pdf, other]

Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing

Authors: Jesus A. De Loera, Jamie Haddock, Anna Ma, Deanna Needell

Abstract: Machine learning algorithms typically rely on optimization subroutines and are well-known to provide very effective outcomes for many types of problems. Here, we flip the reliance and ask the reverse question: can machine learning algorithms lead to more effective outcomes for optimization problems? Our goal is to train machine learning methods to automatically improve the performance of optimization and signal processing algorithms. As a proof of concept, we use our approach to improve two popular data processing subroutines in data science: stochastic gradient descent and greedy methods in compressed sensing. We provide experimental results that demonstrate the answer is ``yes'', machine learning algorithms do lead to more effective outcomes for optimization problems, and show the future potential for this research direction.

Submitted 26 July, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

arXiv:1905.08894 [pdf, other]

On block Gaussian sketching for the Kaczmarz method

Authors: Deanna Needell, Elizaveta Rebrova

Abstract: The Kaczmarz algorithm is one of the most popular methods for solving large-scale over-determined linear systems due to its simplicity and computational efficiency. This method can be viewed as a special instance of a more general class of sketch and project methods. Recently, a block Gaussian version was proposed that uses a block Gaussian sketch, enjoying the regularization properties of Gaussian sketching, combined with the acceleration of the block variants. Theoretical analysis was only provided for the non-block version of the Gaussian sketch method. Here, we provide theoretical guarantees for the block Gaussian Kaczmarz method, proving a number of convergence results showing convergence to the solution exponentially fast in expectation. On the flip side, with this theory and extensive experimental support, we observe that the numerical complexity of each iteration typically makes this method inferior to other iterative projection methods. We highlight only one setting in which it may be advantageous, namely when the regularizing effect is used to reduce variance in the iterates under certain noise models and convergence for some particular matrix constructions.

Submitted 21 January, 2020; v1 submitted 21 May, 2019; originally announced May 2019.

MSC Class: 65F10; 68W20; 60B20

arXiv:1904.08540 [pdf, other]

Matrix Completion With Selective Sampling

Authors: Christian Parkinson, Kevin Huynh, Deanna Needell

Abstract: Matrix completion is a classical problem in data science wherein one attempts to reconstruct a low-rank matrix while only observing some subset of the entries. Previous authors have phrased this problem as a nuclear norm minimization problem. Almost all previous work assumes no explicit structure of the matrix and uses uniform sampling to decide the observed entries. We suggest methods for selective sampling in the case where we have some knowledge about the structure of the matrix and are allowed to design the observation set.

Submitted 17 April, 2019; originally announced April 2019.

Comments: 4 pages, 4 figures

arXiv:1902.02862 [pdf, other]

Lattices from tight frames and vertex transitive graphs

Authors: Lenny Fukshansky, Deanna Needell, Josiah Park, Yuxin Xin

Abstract: We show that real tight frames that generate lattices must be rational, and use this observation to describe a construction of lattices from vertex transitive graphs. In the case of irreducible group frames, we show that the corresponding lattice is always strongly eutactic. This is the case for the more restrictive class of distance transitive graphs. We show that such lattices exist in arbitrarily large dimensions and demonstrate examples arising from some notable families of graphs. In particular, some well-known root lattices and those related to them can be recovered this way. We discuss various properties of this construction and also mention some potential applications of lattices generated by incoherent systems of vectors.

Submitted 18 August, 2019; v1 submitted 7 February, 2019; originally announced February 2019.

Comments: some corrections of typos made; updated bibliography

MSC Class: 11H31; 52C17; 42C15; 05C50; 05C76

arXiv:1809.03041 [pdf, other]

An iterative method for classification of binary data

Authors: Denali Molitor, Deanna Needell

Abstract: In today's data driven world, storing, processing, and gleaning insights from large-scale data are major challenges. Data compression is often required in order to store large amounts of high-dimensional data, and thus, efficient inference methods for analyzing compressed data are necessary. Building on a recently designed simple framework for classification using binary data, we demonstrate that one can improve classification accuracy of this approach through iterative applications whose output serves as input to the next application. As a side consequence, we show that the original framework can be used as a data preprocessing step to improve the performance of other methods, such as support vector machines. For several simple settings, we showcase the ability to obtain theoretical guarantees for the accuracy of the iterative classification method. The simplicity of the underlying classification framework makes it amenable to theoretical analysis and studying this approach will hopefully serve as a step toward developing theory for more sophisticated deep learning technologies.

Submitted 9 September, 2018; originally announced September 2018.

MSC Class: 68T05; 68P30; 68U10

arXiv:1808.04421 [pdf, other]

Tribracket Modules

Authors: Deanna Needell, Sam Nelson, Yingqi Shi

Abstract: Niebrzydowski tribrackets are ternary operations on sets satisfying conditions obtained from the oriented Reidemeister moves such that the set of tribracket colorings of an oriented knot or link diagram is an invariant of oriented knots and links. We introduce tribracket modules analogous to quandle/biquandle/rack modules and use these structures to enhance the tribracket counting invariant. We provide examples to illustrate the computation of the invariant and show that the enhancement is proper.

Submitted 26 August, 2019; v1 submitted 13 August, 2018; originally announced August 2018.

Comments: 11 pages, v2 contains typo corrections and other small improvements

MSC Class: 57M27; 57M25

arXiv:1807.08825 [pdf, other]

Hierarchical Classification using Binary Data

Authors: Denali Molitor, Deanna Needell

Abstract: In classification problems, especially those that categorize data into a large number of classes, the classes often naturally follow a hierarchical structure. That is, some classes are likely to share similar structures and features. Those characteristics can be captured by considering a hierarchical relationship among the class labels. Here, we extend a recent simple classification approach on binary data in order to efficiently classify hierarchical data. In certain settings, specifically, when some classes are significantly easier to identify than others, we showcase computational and accuracy advantages.

Submitted 23 July, 2018; originally announced July 2018.

Comments: AAAI Magazine special Issue on Deep Models, Machine Learning and Artificial Intelligence Applications in National and International Security, June, 2018

arXiv:1807.04839 [pdf, other]

doi 10.1109/TSP.2019.2899286

An Approximate Message Passing Framework for Side Information

Authors: Anna Ma, You, Zhou, Cynthia Rush, Dror Baron, Deanna Needell

Abstract: Approximate message passing (AMP) methods have gained recent traction in sparse signal recovery. Additional information about the signal, or \emph{side information} (SI), is commonly available and can aid in efficient signal recovery. This work presents an AMP-based framework that exploits SI and can be readily implemented in various settings for which the SI results in separable distributions. To illustrate the simplicity and applicability of our approach, this framework is applied to a Bernoulli-Gaussian (BG) model and a time-varying birth-death-drift (BDD) signal model, motivated by applications in channel estimation. We develop a suite of algorithms, called AMP-SI, and derive denoisers for the BDD and BG models. Numerical evidence demonstrating the advantages of our approach are presented alongside empirical evidence of the accuracy of a proposed state evolution.

Submitted 2 May, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

arXiv:1805.12529 [pdf, other]

Analysis of Fast Structured Dictionary Learning

Authors: Saiprasad Ravishankar, Anna Ma, Deanna Needell

Abstract: Sparsity-based models and techniques have been exploited in many signal processing and imaging applications. Data-driven methods based on dictionary and sparsifying transform learning enable learning rich image features from data, and can outperform analytical models. In particular, alternating optimization algorithms have been popular for learning such models. In this work, we focus on alternating minimization for a specific structured unitary sparsifying operator learning problem, and provide a convergence analysis. While the algorithm converges to the critical points of the problem generally, our analysis establishes under mild assumptions, the local linear convergence of the algorithm to the underlying sparsifying model of the data. Analysis and numerical simulations show that our assumptions hold for standard probabilistic data models. In practice, the algorithm is robust to initialization.

Submitted 23 September, 2019; v1 submitted 31 May, 2018; originally announced May 2018.

Comments: This article has been accepted for publication in Information and Inference Published by Oxford University Press

arXiv:1803.08114 [pdf, ps, other]

doi 10.1063/1.5044141

Randomized Projection Methods for Linear Systems with Arbitrarily Large Sparse Corruptions

Authors: Jamie Haddock, Deanna Needell

Abstract: In applications like medical imaging, error correction, and sensor networks, one needs to solve large-scale linear systems that may be corrupted by a small number of arbitrarily large corruptions. We consider solving such large-scale systems of linear equations $A\mathbf{x}=\mathbf{b}$ that are inconsistent due to corruptions in the measurement vector $\mathbf{b}$. With this as our motivating example, we develop an approach for this setting that allows detection of the corrupted entries and thus convergence to the "true" solution of the original system. We provide analytical justification for our approaches as well as experimental evidence on real and synthetic systems.

Submitted 22 December, 2018; v1 submitted 21 March, 2018; originally announced March 2018.

MSC Class: 65F10; 65F20; 65F22

arXiv:1802.03126 [pdf, ps, other]

On Motzkin's Method for Inconsistent Linear Systems

Authors: Jamie Haddock, Deanna Needell

Abstract: Iterative linear solvers have gained recent popularity due to their computational efficiency and low memory footprint for large-scale linear systems. The relaxation method, or Motzkin's method, can be viewed as an iterative method that projects the current estimation onto the solution hyperplane corresponding to the most violated constraint. Although this leads to an optimal selection strategy for consistent systems, for inconsistent least square problems, the strategy presents a tradeoff between convergence rate and solution accuracy. We provide a theoretical analysis that shows Motzkin's method offers an initially accelerated convergence rate and this acceleration depends on the dynamic range of the residual. We quantify this acceleration for Gaussian systems as a concrete example. Lastly, we include experimental evidence on real and synthetic systems that support the analysis.

Submitted 26 October, 2018; v1 submitted 8 February, 2018; originally announced February 2018.

MSC Class: 15A06; 65F10; 65F20; 65F22

arXiv:1802.00518 [pdf, other]

Analysis of Fast Alternating Minimization for Structured Dictionary Learning

Authors: Saiprasad Ravishankar, Anna Ma, Deanna Needell

Abstract: Methods exploiting sparsity have been popular in imaging and signal processing applications including compression, denoising, and imaging inverse problems. Data-driven approaches such as dictionary learning and transform learning enable one to discover complex image features from datasets and provide promising performance over analytical models. Alternating minimization algorithms have been particularly popular in dictionary or transform learning. In this work, we study the properties of alternating minimization for structured (unitary) sparsifying operator learning. While the algorithm converges to the stationary points of the non-convex problem in general, we prove rapid local linear convergence to the underlying generative model under mild assumptions. Our experiments show that the unitary operator learning algorithm is robust to initialization.

Submitted 1 February, 2018; originally announced February 2018.

arXiv:1801.10264 [pdf, other]

Compressed Anomaly Detection with Multiple Mixed Observations

Authors: Natalie Durgin, Rachel Grotheer, Chenxi Huang, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

Abstract: We consider a collection of independent random variables that are identically distributed, except for a small subset which follows a different, anomalous distribution. We study the problem of detecting which random variables in the collection are governed by the anomalous distribution. Recent work proposes to solve this problem by conducting hypothesis tests based on mixed observations (e.g. linear combinations) of the random variables. Recognizing the connection between taking mixed observations and compressed sensing, we view the problem as recovering the "support" (index set) of the anomalous random variables from multiple measurement vectors (MMVs). Many algorithms have been developed for recovering jointly sparse signals and their support from MMVs. We establish the theoretical and empirical effectiveness of these algorithms at detecting anomalies. We also extend the LASSO algorithm to an MMV version for our purpose. Further, we perform experiments on synthetic data, consisting of samples from the random variables, to explore the trade-off between the number of mixed observations per sample and the number of samples required to detect anomalies.

Submitted 19 June, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

Comments: 27 pages, 9 figures. Incorporates reviewer feedback, additional experiments, and additional figures

arXiv:1801.09657 [pdf, other]

Matrix Completion for Structured Observations

Authors: Denali Molitor, Deanna Needell

Abstract: The need to predict or fill-in missing data, often referred to as matrix completion, is a common challenge in today's data-driven world. Previous strategies typically assume that no structural difference between observed and missing entries exists. Unfortunately, this assumption is woefully unrealistic in many applications. For example, in the classic Netflix challenge, in which one hopes to predict user-movie ratings for unseen films, the fact that the viewer has not watched a given movie may indicate a lack of interest in that movie, thus suggesting a lower rating than otherwise expected. We propose adjusting the standard nuclear norm minimization strategy for matrix completion to account for such structural differences between observed and unobserved entries by regularizing the values of the unobserved entries. We show that the proposed method outperforms nuclear norm minimization in certain settings.

Submitted 29 January, 2018; originally announced January 2018.

arXiv:1801.01526 [pdf, other]

An algebraic perspective on integer sparse recovery

Authors: Lenny Fukshansky, Deanna Needell, Benny Sudakov

Abstract: Compressed sensing is a relatively new mathematical paradigm that shows a small number of linear measurements are enough to efficiently reconstruct a large dimensional signal under the assumption the signal is sparse. Applications for this technology are ubiquitous, ranging from wireless communications to medical imaging, and there is now a solid foundation of mathematical theory and algorithms to robustly and efficiently reconstruct such signals. However, in many of these applications, the signals of interest do not only have a sparse representation, but have other structure such as lattice-valued coefficients. While there has been a small amount of work in this setting, it is still not very well understood how such extra information can be utilized during sampling and reconstruction. Here, we explore the problem of integer sparse reconstruction, lending insight into when this knowledge can be useful, and what types of sampling designs lead to robust reconstruction guarantees. We use a combination of combinatorial, probabilistic and number-theoretic methods to discuss existence and some constructions of such sensing matrices with concrete examples. We also prove sparse versions of Minkowski's Convex Body and Linear Forms theorems that exhibit some limitations of this framework.

Submitted 4 January, 2018; originally announced January 2018.

MSC Class: 41A46; 68Q25; 68W20

arXiv:1711.02743 [pdf, other]

Sparse Randomized Kaczmarz for Support Recovery of Jointly Sparse Corrupted Multiple Measurement Vectors

Authors: Natalie Durgin, Rachel Grotheer, Chenxi Huang, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

Abstract: While single measurement vector (SMV) models have been widely studied in signal processing, there is a surging interest in addressing the multiple measurement vectors (MMV) problem. In the MMV setting, more than one measurement vector is available and the multiple signals to be recovered share some commonalities such as a common support. Applications in which MMV is a naturally occurring phenomenon include online streaming, medical imaging, and video recovery. This work presents a stochastic iterative algorithm for the support recovery of jointly sparse corrupted MMV. We present a variant of the Sparse Randomized Kaczmarz algorithm for corrupted MMV and compare our proposed method with an existing Kaczmarz type algorithm for MMV problems. We also showcase the usefulness of our approach in the online (streaming) setting and provide empirical evidence that suggests the robustness of the proposed method to the distribution of the corruption and the number of corruptions occurring.

Submitted 14 June, 2018; v1 submitted 7 November, 2017; originally announced November 2017.

Comments: 13 pages, 6 figures

arXiv:1711.01521 [pdf, other]

Stochastic Greedy Algorithms For Multiple Measurement Vectors

Authors: Jing Qin, Shuang Li, Deanna Needell, Anna Ma, Rachel Grotheer, Chenxi Huang, Natalie Durgin

Abstract: Sparse representation of a single measurement vector (SMV) has been explored in a variety of compressive sensing applications. Recently, SMV models have been extended to solve multiple measurement vectors (MMV) problems, where the underlying signal is assumed to have joint sparse structures. To circumvent the NP-hardness of the $\ell_0$ minimization problem, many deterministic MMV algorithms solve the convex relaxed models with limited efficiency. In this paper, we develop stochastic greedy algorithms for solving the joint sparse MMV reconstruction problem. In particular, we propose the MMV Stochastic Iterative Hard Thresholding (MStoIHT) and MMV Stochastic Gradient Matching Pursuit (MStoGradMP) algorithms, and we also utilize the mini-batching technique to further improve their performance. Convergence analysis indicates that the proposed algorithms are able to converge faster than their SMV counterparts, i.e., concatenated StoIHT and StoGradMP, under certain conditions. Numerical experiments have illustrated the superior effectiveness of the proposed algorithms over their SMV counterparts.

Submitted 22 August, 2020; v1 submitted 4 November, 2017; originally announced November 2017.

MSC Class: 68W20; 94A12; 47N10

arXiv:1710.00034 [pdf, other]

Micro-optical Tandem Luminescent Solar Concentrators

Authors: David R. Needell, Ognjen Ilic, Colton R. Bukowsky, Zach Nett, Lu Xu, Junwen He, Haley Bauser, Benjamin G. Lee, John F. Geisz, Ralph G. Nuzzo, A. Paul Alivisatos, Harry A. Atwater

Abstract: Traditional concentrating photovoltaic (CPV) systems utilize multijunction cells to minimize thermalization losses, but cannot efficiently capture diffuse sunlight, which contributes to a high levelized cost of energy (LCOE) and limits their use to geographical regions with high direct sunlight insolation. Luminescent solar concentrators (LSCs) harness light generated by luminophores embedded in a light-trapping waveguide to concentrate light onto smaller cells. LSCs can absorb both direct and diffuse sunlight, and thus can operate as flat plate receivers at a fixed tilt and with a conventional module form factor. However, current LSCs experience significant power loss through parasitic luminophore absorption and incomplete light trapping by the optical waveguide. Here we introduce a tandem LSC device architecture that overcomes both of these limitations, consisting of a PLMA polymer layer with embedded CdSe/CdS quantum dot (QD) luminophores and InGaP micro-cells, which serve as a high bandgap absorber on top of a conventional Si photovoltaic. We experimentally synthesize CdSe/CdS QDs with exceptionally high quantum-yield (99%) and ultra-narrowband emission optimally matched to fabricated III-V InGaP micro-cells. Using a Monte Carlo ray-tracing model, we show the radiative limit power conversion efficiency for a module with these components to be 30.8% under diffuse sunlight conditions. These results indicate that a tandem LSC-on-Si architecture could significantly improve upon the efficiency of a conventional Si photovoltaic module with simple and straightforward alterations of the module lamination steps of a Si photovoltaic manufacturing process, with promise for widespread module deployment across diverse geographical regions and energy markets.

Submitted 5 September, 2017; originally announced October 2017.

Showing 51–100 of 156 results for author: Needell, D