Minimal Realization Problems for Hidden Markov Models

Authors: Qingqing Huang, Rong Ge, Sham Kakade, Munther Dahleh

Abstract: Consider a stationary discrete random process with alphabet size d, which is assumed to be the output process of an unknown stationary Hidden Markov Model (HMM). Given the joint probabilities of finite length strings of the process, we are interested in finding a finite state generative model to describe the entire process. In particular, we focus on two classes of models: HMMs and quasi-HMMs, the latter being a strictly larger class of models containing HMMs. In the main theorem, we show that if the random process is generated by an HMM of order less than or equal to k, and whose transition and observation probability matrices are in general position, namely almost everywhere on the parameter space, both the minimal quasi-HMM realization and the minimal HMM realization can be efficiently computed based on the joint probabilities of all the length N strings, for $N > 4 \lceil \log_d(k) \rceil + 1$. In this paper, we also aim to compare and connect the two lines of literature: realization theory of HMMs, and the recent development in learning latent variable models with tensor decomposition techniques.

Submitted 14 December, 2015; v1 submitted 13 November, 2014; originally announced November 2014.
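
As a worked illustration of the string-length condition $N > 4 \lceil \log_d(k) \rceil + 1$ quoted in the abstract above, the following minimal sketch computes the smallest integer $N$ that satisfies it for a given order bound k and alphabet size d. The function names `ceil_log` and `min_string_length` are hypothetical helpers introduced here, not part of the paper; the logarithm is computed with integer arithmetic to avoid floating-point rounding.

```python
def ceil_log(k: int, d: int) -> int:
    """Return the smallest integer m with d**m >= k (i.e. ceil(log_d k))."""
    if k < 1 or d < 2:
        raise ValueError("need k >= 1 and d >= 2")
    m, power = 0, 1
    while power < k:
        power *= d
        m += 1
    return m


def min_string_length(k: int, d: int) -> int:
    """Smallest integer N satisfying N > 4 * ceil(log_d k) + 1."""
    return 4 * ceil_log(k, d) + 2
```

For example, with a binary alphabet (d = 2) and order bound k = 8, the ceiling of the logarithm is 3, so the condition reads N > 13.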

arXiv:1411.1488 [pdf, ps, other]

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Authors: Anima Anandkumar, Rong Ge, Majid Janzamin

Abstract: We present a novel analysis of the dynamics of tensor power iterations in the overcomplete regime where the tensor CP rank is larger than the input dimension. Finding the CP decomposition of an overcomplete tensor is NP-hard in general. We consider the case where the tensor components are randomly drawn, and show that the simple power iteration recovers the components with bounded error under mild initialization conditions. We apply our analysis to unsupervised learning of latent variable models, such as multi-view mixture models and spherical Gaussian mixtures. Given the third order moment tensor, we learn the parameters using tensor power iterations. We prove it can correctly learn the model parameters when the number of hidden components $k$ is much larger than the data dimension $d$, up to $k = o(d^{1.5})$. We initialize the power iterations with data samples and prove its success under mild conditions on the signal-to-noise ratio of the samples. Our analysis significantly expands the class of latent variable models where spectral methods are applicable. Our analysis also deals with noise in the input tensor, leading to a sample complexity result in the application to learning latent variable models.

Submitted 14 September, 2015; v1 submitted 5 November, 2014; originally announced November 2014.

Comments: 38 pages; analysis of noise added to the previous version

arXiv:1408.0553 [pdf, ps, other]

Sample Complexity Analysis for Learning Overcomplete Latent Variable Models through Tensor Methods

Authors: Animashree Anandkumar, Rong Ge, Majid Janzamin

Abstract: We provide guarantees for learning latent variable models with an emphasis on the overcomplete regime, where the dimensionality of the latent space can exceed the observed dimensionality. In particular, we consider multiview mixtures, spherical Gaussian mixtures, ICA, and sparse coding models. We provide tight concentration bounds for empirical moments through novel covering arguments. We analyze parameter recovery through a simple tensor power update algorithm. In the semi-supervised setting, we exploit the label or prior information to get a rough estimate of the model parameters, and then refine it using the tensor method on unlabeled samples. We establish that learning is possible when the number of components scales as $k=o(d^{p/2})$, where $d$ is the observed dimension, and $p$ is the order of the observed moment employed in the tensor method. Our concentration bound analysis also leads to minimax sample complexity for semi-supervised learning of spherical Gaussian mixtures. In the unsupervised setting, we use a simple initialization algorithm based on SVD of the tensor slices, and provide guarantees under the stricter condition that $k\le βd$ (where constant $β$ can be larger than $1$), where the tensor method recovers the components under a polynomial running time (and exponential in $β$). Our analysis establishes that a wide range of overcomplete latent variable models can be learned efficiently with low computational and sample complexity through tensor decomposition methods.

Submitted 16 December, 2014; v1 submitted 3 August, 2014; originally announced August 2014.

Comments: Title changed

arXiv:1402.5180 [pdf, ps, other]

Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-$1$ Updates

Authors: Animashree Anandkumar, Rong Ge, Majid Janzamin

Abstract: In this paper, we provide local and global convergence guarantees for recovering CP (Candecomp/Parafac) tensor decomposition. The main step of the proposed algorithm is a simple alternating rank-$1$ update which is the alternating version of the tensor power iteration adapted for asymmetric tensors. Local convergence guarantees are established for third order tensors of rank $k$ in $d$ dimensions, when $k=o \bigl( d^{1.5} \bigr)$ and the tensor components are incoherent. Thus, we can recover overcomplete tensor decomposition. We also strengthen the results to global convergence guarantees under stricter rank condition $k \le βd$ (for arbitrary constant $β> 1$) through a simple initialization procedure where the algorithm is initialized by top singular vectors of random tensor slices. Furthermore, the approximate local convergence guarantees for $p$-th order tensors are also provided under rank condition $k=o \bigl( d^{p/2} \bigr)$. The guarantees also include tight perturbation analysis given noisy tensor.

Submitted 4 March, 2015; v1 submitted 20 February, 2014; originally announced February 2014.

Comments: We have added an additional sub-algorithm to remove the (approximate) residual error left after the tensor power iteration

arXiv:1401.0579 [pdf, ps, other]

More Algorithms for Provable Dictionary Learning

Authors: Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

Abstract: In dictionary learning, also known as sparse coding, the algorithm is given samples of the form $y = Ax$ where $x\in \mathbb{R}^m$ is an unknown random sparse vector and $A$ is an unknown dictionary matrix in $\mathbb{R}^{n\times m}$ (usually $m > n$, which is the overcomplete case). The goal is to learn $A$ and $x$. This problem has been studied in neuroscience, machine learning, vision, and image processing. In practice it is solved by heuristic algorithms and provable algorithms seemed hard to find. Recently, provable algorithms were found that work if the unknown feature vector $x$ is $\sqrt{n}$-sparse or even sparser. Spielman et al. \cite{DBLP:journals/jmlr/SpielmanWW12} did this for dictionaries where $m=n$; Arora et al. \cite{AGM} gave an algorithm for overcomplete ($m >n$) and incoherent matrices $A$; and Agarwal et al. \cite{DBLP:journals/corr/AgarwalAN13} handled a similar case but with weaker guarantees. This raised the problem of designing provable algorithms that allow sparsity $\gg \sqrt{n}$ in the hidden vector $x$. The current paper designs algorithms that allow sparsity up to $n/poly(\log n)$. It works for a class of matrices where features are individually recoverable, a new notion identified in this paper that may motivate further work. The algorithms run in quasipolynomial time because they use limited enumeration.

Submitted 2 January, 2014; originally announced January 2014.

Comments: 23 pages

arXiv:1312.2939 [pdf, ps, other]

doi 10.1088/1367-2630/16/11/113048

Quasinormal mode approach to modelling light-emission and propagation in nanoplasmonics

Authors: Rong-Chun Ge, Philip Trost Kristensen, Jeff. F. Young, Stephen Hughes

Abstract: We describe a powerful and intuitive technique for modeling light-matter interactions in classical and quantum nanoplasmonics. Our approach uses a quasinormal mode expansion of the Green function within a metal nanoresonator of arbitrary shape, together with a Dyson equation, to derive an expression for the spontaneous decay rate and far field propagator from dipole oscillators outside resonators. For a single quasinormal mode, at field positions outside the quasi-static coupling regime, we give a closed form solution for the Purcell factor and generalized effective mode volume. We augment this with an analytic expression for the divergent LDOS very near the metal surface, which allows us to derive a simple and highly accurate expression for the electric field outside the metal resonator at distances from a few nanometers to infinity. This intuitive formalism provides an enormous simplification over full numerical calculations and fixes several pending problems in quasinormal mode theory.

Submitted 2 June, 2014; v1 submitted 10 December, 2013; originally announced December 2013.

arXiv:1310.6343 [pdf, other]

Provable Bounds for Learning Some Deep Representations

Authors: Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

Abstract: We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our generative model is an $n$ node multilayer neural net that has degree at most $n^γ$ for some $γ<1$ and each edge has a random edge weight in $[-1,1]$. Our algorithm learns {\em almost all} networks in this class with polynomial running time. The sample complexity is quadratic or cubic depending upon the details of the model. The algorithm uses layerwise learning. It is based upon a novel idea of observing correlations among features and using these to infer the underlying edge structure via a global graph recovery procedure. The analysis of the algorithm reveals interesting structure of neural networks with random edge weights.

Submitted 23 October, 2013; originally announced October 2013.

Comments: The first 18 pages serve as an extended abstract and a 36 pages long technical appendix follows

arXiv:1308.6273 [pdf, ps, other]

New Algorithms for Learning Incoherent and Overcomplete Dictionaries

Authors: Sanjeev Arora, Rong Ge, Ankur Moitra

Abstract: In sparse recovery we are given a matrix $A$ (the dictionary) and a vector of the form $A X$ where $X$ is sparse, and the goal is to recover $X$. This is a central notion in signal processing, statistics and machine learning. But in applications such as sparse coding, edge detection, compression and super resolution, the dictionary $A$ is unknown and has to be learned from random examples of the form $Y = AX$ where $X$ is drawn from an appropriate distribution --- this is the dictionary learning problem. In most settings, $A$ is overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman, Wang and Wright who gave an algorithm for the full-rank case, which is rarely the case in applications. Our algorithm applies to incoherent dictionaries which have been a central object of study since they were introduced in seminal work of Donoho and Huo. In particular, a dictionary is $μ$-incoherent if each pair of columns has inner product at most $μ/ \sqrt{n}$. The algorithm makes natural stochastic assumptions about the unknown sparse vector $X$, which can contain $k \leq c \min(\sqrt{n}/μ\log n, m^{1/2 -η})$ non-zero entries (for any $η> 0$). This is close to the best $k$ allowable by the best sparse recovery algorithms even if one knows the dictionary $A$ exactly.
Moreover, both the running time and sample complexity depend on $\log 1/ε$, where $ε$ is the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on $1/ε$, and this is necessary.

Submitted 26 May, 2014; v1 submitted 28 August, 2013; originally announced August 2013.
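
The $μ$-incoherence definition in the abstract above (each pair of columns has inner product at most $μ/\sqrt{n}$) can be illustrated with a minimal sketch that computes the smallest such $μ$ for a given $n \times m$ matrix. Two assumptions are made here that the abstract does not spell out: columns are rescaled to unit Euclidean norm before comparison, and inner products are taken in absolute value. The function name `incoherence_parameter` is hypothetical and not taken from the paper.

```python
import numpy as np


def incoherence_parameter(A) -> float:
    """Smallest mu such that |<A_i, A_j>| <= mu / sqrt(n) for all i != j.

    Assumption: columns are normalized to unit norm first, and the
    inner products are compared in absolute value.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    cols = A / np.linalg.norm(A, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)   # pairwise |inner products| of columns
    np.fill_diagonal(gram, 0.0)    # ignore each column paired with itself
    return float(np.sqrt(n) * gram.max())
```

An orthonormal dictionary gives a value of 0, while two columns at 45 degrees in the plane give a value of 1 under these assumptions.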

arXiv:1304.6045 [pdf, ps, other]

doi 10.1103/PhysRevB.87.205425

Accessing quantum nanoplasmonics in a hybrid quantum-dot metal nanosystem: Mollow triplet of a quantum dot near a metal nanoparticle

Authors: Rong-Chun Ge, C. Van Vlack, P. Yao, Jeff. F. Young, S. Hughes

Abstract: We present a theoretical study of the resonance fluorescence spectra of an optically driven quantum dot placed near a single metal nanoparticle. The metallic reservoir coupling is calculated for an 8-nm metal nanoparticle using a time-convolutionless master equation approach where the exact photon reservoir function is included using Green function theory. By exciting the system coherently near the nanoparticle dipole mode, we show that the driven Mollow spectrum becomes highly asymmetric due to internal coupling effects with higher-order plasmons. We also highlight regimes of resonance squeezing and broadening as well as spectral reshaping through light propagation. Our master equation technique can be applied to any arbitrary material system, including lossy inhomogeneous structures, where mode expansion techniques are known to break down.

Submitted 22 April, 2013; originally announced April 2013.

Journal ref: Phys. Rev. B 87, 205425 (2013)

arXiv:1304.3365 [pdf, other]

Towards a better approximation for sparsest cut?

Authors: Sanjeev Arora, Rong Ge, Ali Kemal Sinop

Abstract: We give a new $(1+ε)$-approximation for sparsest cut problem on graphs where small sets expand significantly more than the sparsest cut (sets of size $n/r$ expand by a factor $\sqrt{\log n\log r}$ bigger, for some small $r$; this condition holds for many natural graph families). We give two different algorithms. One involves Guruswami-Sinop rounding on the level-$r$ Lasserre relaxation. The other is combinatorial and involves a new notion called {\em Small Set Expander Flows} (inspired by the {\em expander flows} of ARV) which we show exists in the input graph. Both algorithms run in time $2^{O(r)} \mathrm{poly}(n)$. We also show similar approximation algorithms in graphs with genus $g$ with an analogous local expansion condition. This is the first algorithm we know of that achieves $(1+ε)$-approximation on such general family of graphs.

Submitted 11 April, 2013; originally announced April 2013.

arXiv:1302.3413 [pdf, ps, other]

doi 10.1364/OL.38.001691

Mollow "quintuplets" from coherently-excited quantum dots

Authors: Rong-Chun Ge, S. Weiler, A. Ulhaq, S. M. Ulrich, M. Jetter, P. Michler, S. Hughes

Abstract: Charge-neutral excitons in semiconductor quantum dots have a small finite energy separation caused by the anisotropic exchange splitting. Coherent excitation of neutral excitons will generally excite both exciton components, unless the excitation is parallel to one of the dipole axes. We present a polaron master equation model to describe two-exciton pumping using a coherent continuous wave pump field in the presence of a realistic anisotropic exchange splitting. We predict a five-peak incoherent spectrum, thus generalizing the Mollow triplet to become a Mollow quintuplet. We experimentally confirm such spectral quintuplets for In(Ga)As quantum dots and obtain very good agreement with theory.

Submitted 14 February, 2013; originally announced February 2013.

Journal ref: Optics Letters, Vol. 38, Issue 10, pp. 1691-1693 (2013)

arXiv:1302.2684 [pdf, ps, other]

A Tensor Approach to Learning Mixed Membership Community Models

Authors: Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade

Abstract: Community detection is the task of detecting hidden communities from observed interactions. Guaranteed community detection has so far been mostly limited to models with non-overlapping communities such as the stochastic block model. In this paper, we remove this restriction, and provide guaranteed community detection for a family of probabilistic network models with overlapping communities, termed as the mixed membership Dirichlet model, first introduced by Airoldi et al. This model allows for nodes to have fractional memberships in multiple communities and assumes that the community memberships are drawn from a Dirichlet distribution. Moreover, it contains the stochastic block model as a special case. We propose a unified approach to learning these models via a tensor spectral decomposition method. Our estimator is based on low-order moment tensor of the observed network, consisting of 3-star counts. Our learning method is fast and is based on simple linear algebraic operations, e.g. singular value decomposition and tensor power iterations. We provide guaranteed recovery of community memberships and model parameters and present a careful finite sample analysis of our learning method. As an important special case, our results match the best known scaling requirements for the (homogeneous) stochastic block model.

Submitted 24 October, 2013; v1 submitted 11 February, 2013; originally announced February 2013.

arXiv:1212.4777 [pdf, other]

A Practical Algorithm for Topic Modeling with Provable Guarantees

Authors: Sanjeev Arora, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu

Abstract: Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on a maximum likelihood objective. Efficient algorithms exist that approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for topic model inference that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

Submitted 19 December, 2012; originally announced December 2012.

Comments: 26 pages

arXiv:1210.7559 [pdf, ps, other]

Tensor decompositions for learning latent variable models

Authors: Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade, Matus Telgarsky

Abstract: This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models---including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.

Submitted 13 November, 2014; v1 submitted 29 October, 2012; originally announced October 2012.

Journal ref: Journal of Machine Learning Research, 15(Aug):2773-2832, 2014
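
The power-iteration idea mentioned in the abstract above can be sketched for the simplest case: a symmetric third-order tensor $T = \sum_i λ_i v_i \otimes v_i \otimes v_i$ with orthonormal $v_i$ and positive $λ_i$, given exactly (no noise). This is a minimal illustration under those assumptions, not the robust method analyzed in the paper; the function names, the random restarts, and the iteration counts are choices made here for the example.

```python
import numpy as np


def tensor_apply(T, u):
    """Contract a third-order tensor with u in its last two modes: T(I, u, u)."""
    return np.einsum("ijk,j,k->i", T, u, u)


def power_iteration(T, n_iter=100, n_restarts=10, rng=None):
    """Run u <- T(I, u, u) / ||T(I, u, u)|| from several random starts.

    Returns the (eigenvalue, eigenvector) pair with the largest value
    of T(u, u, u) among the restarts.
    """
    rng = np.random.default_rng(rng)
    d = T.shape[0]
    best_lam, best_u = -np.inf, None
    for _ in range(n_restarts):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        for _ in range(n_iter):
            w = tensor_apply(T, u)
            norm = np.linalg.norm(w)
            if norm == 0.0:
                break
            u = w / norm
        lam = float(u @ tensor_apply(T, u))
        if lam > best_lam:
            best_lam, best_u = lam, u
    return best_lam, best_u


def decompose(T, k, rng=None):
    """Extract k components by power iteration followed by deflation."""
    rng = np.random.default_rng(rng)
    T = np.array(T, dtype=float)  # work on a copy
    components = []
    for _ in range(k):
        lam, u = power_iteration(T, rng=rng)
        components.append((lam, u))
        # Deflate: subtract the recovered rank-one term.
        T -= lam * np.einsum("i,j,k->ijk", u, u, u)
    return components
```

Deflation is used here because, for an orthogonally decomposable tensor, subtracting a recovered rank-one term leaves the remaining components unchanged, so the same iteration can be reused for each component in turn.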

arXiv:1206.5349 [pdf, ps, other]

Provable ICA with Unknown Gaussian Noise, and Implications for Gaussian Mixtures and Autoencoders

Authors: Sanjeev Arora, Rong Ge, Ankur Moitra, Sushant Sachdeva

Abstract: We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees. In particular, suppose we are given samples of the form $y = Ax + η$ where $A$ is an unknown $n \times n$ matrix and $x$ is a random variable whose components are independent and have a fourth moment strictly less than that of a standard Gaussian random variable and $η$ is an $n$-dimensional Gaussian random variable with unknown covariance $Σ$: We give an algorithm that provably recovers $A$ and $Σ$ up to an additive $ε$ and whose running time and sample complexity are polynomial in $n$ and $1 / ε$. To accomplish this, we introduce a novel "quasi-whitening" step that may be useful in other contexts in which the covariance of Gaussian noise is not known in advance. We also give a general framework for finding all local optima of a function (given an oracle for approximately finding just one) and this is a crucial step in our algorithm, one that has been overlooked in previous attempts, and allows us to control the accumulation of error when we find the columns of $A$ one by one via local search.

Submitted 11 November, 2012; v1 submitted 22 June, 2012; originally announced June 2012.

arXiv:1206.3644 [pdf, ps, other]

Efficient Quantum Ratchet

Authors: Chuan-Feng Li, Rong-Chun Ge, Guang-Can Guo

Abstract: Quantum resonance is one of the main characteristics of the quantum kicked rotor, which has been used to induce accelerated ratchet current of the particles with a generalized asymmetry potential. Here we show that by desynchronizing the kicked potentials of the flashing ratchet [Phys. Rev. Lett. 94, 110603 (2005)], new quantum resonances are stimulated to conduct directed currents more efficiently. Most distinctly, the missed resonances $κ=1.0π$ and $κ=3.0π$ are created to induce even larger currents. At the same time, with the help of semiclassical analysis, we prove that our result is exact rather than a phenomenon induced by errors of the numerical simulation. Our discovery may be used to realize directed transport efficiently, and may also lead to a deeper understanding of symmetry breaking for the dynamical evolution.

Submitted 16 June, 2012; originally announced June 2012.

Comments: 10 pages, 4 figures. Comments welcome

arXiv:1204.1956 [pdf, ps, other]

Learning Topic Models - Going beyond SVD

Authors: Sanjeev Arora, Rong Ge, Ankur Moitra

Abstract: Topic Modeling is an approach used for automatic comprehension and classification of data in a variety of settings, and perhaps the canonical application is in uncovering thematic structure in a corpus of documents. A number of foundational works both in machine learning and in theory have suggested a probabilistic model for documents, whereby documents arise as a convex combination of (i.e. distribution on) a small number of topic vectors, each topic vector being a distribution on words (i.e. a vector of word-frequencies). Similar models have since been used in a variety of application areas; the Latent Dirichlet Allocation or LDA model of Blei et al. is especially popular. Theoretical studies of topic modeling focus on learning the model's parameters assuming the data is actually generated from it. Existing approaches for the most part rely on Singular Value Decomposition (SVD), and consequently have one of two limitations: these works need to either assume that each document contains only one topic, or else can only recover the span of the topic vectors instead of the topic vectors themselves. This paper formally justifies Nonnegative Matrix Factorization (NMF) as a main tool in this context, which is an analog of SVD where all vectors are nonnegative. Using this tool we give the first polynomial-time algorithm for learning topic models without the above two limitations. The algorithm uses a fairly mild assumption about the underlying topic matrix called separability, which is usually found to hold in real-life data.
A compelling feature of our algorithm is that it generalizes to models that incorporate topic-topic correlations, such as the Correlated Topic Model and the Pachinko Allocation Model. We hope that this paper will motivate further theoretical results that use NMF as a replacement for SVD - just as NMF has come to replace SVD in many applications.

Submitted 9 April, 2012; v1 submitted 9 April, 2012; originally announced April 2012.

arXiv:1112.1831 [pdf, ps, other]

Finding Overlapping Communities in Social Networks: Toward a Rigorous Approach

Authors: Sanjeev Arora, Rong Ge, Sushant Sachdeva, Grant Schoenebeck

Abstract: A "community" in a social network is usually understood to be a group of nodes more densely connected with each other than with the rest of the network. This is an important concept in most domains where networks arise: social, technological, biological, etc. For many years algorithms for finding communities implicitly assumed communities are nonoverlapping (leading to use of clustering-based approaches) but there is increasing interest in finding overlapping communities. A barrier to finding communities is that the solution concept is often defined in terms of an NP-complete problem such as Clique or Hierarchical Clustering. This paper seeks to initiate a rigorous approach to the problem of finding overlapping communities, where "rigorous" means that we clearly state the following: (a) the object sought by our algorithm (b) the assumptions about the underlying network (c) the (worst-case) running time. Our assumptions about the network lie between worst-case and average-case. An average case analysis would require a precise probabilistic model of the network, on which there is currently no consensus. However, some plausible assumptions about network parameters can be gleaned from a long body of work in the sociology community spanning five decades focusing on the study of individual communities and ego-centric networks. Thus our assumptions are somewhat "local" in nature. Nevertheless they suffice to permit a rigorous analysis of running time of algorithms that recover global structure.
Our algorithms use random sampling similar to that in property testing and algorithms for dense graphs. However, our networks are not necessarily dense graphs, not even in local neighborhoods. Our algorithms explore a local-global relationship between ego-centric and socio-centric networks that we hope will provide a fruitful framework for future work both in computer science and sociology.

Submitted 8 December, 2011; originally announced December 2011.

Comments: 19 pages

ACM Class: F.2.2; J.4

arXiv:1111.0952 [pdf, other]

Computing a Nonnegative Matrix Factorization -- Provably

Authors: Sanjeev Arora, Rong Ge, Ravi Kannan, Ankur Moitra

Abstract: In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $A W$ where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$ -- i.e. (approximately) minimize $\|M - AW\|_F$ where $\|\cdot\|_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where $A$ and $W$ are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-complete. We initiate a study of when this problem is solvable in polynomial time: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant $r$. Indeed NMF is most interesting in applications precisely when $r$ is small. 2. We complement this with a hardness result, that if exact NMF can be solved in time $(nm)^{o(r)}$, 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in $n$, $m$ and $r$ under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions).
Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work.

Submitted 3 November, 2011; originally announced November 2011.

Comments: 29 pages, 3 figures

arXiv:1110.5537 [pdf, ps, other]

doi 10.1088/0256-307X/29/12/120302

Violation of Leggett-Garg inequalities in single quantum dot

Authors: Yong-Nan Sun, Yang Zou, Rong-Chun Ge, Jian-Shun Tang, Chuan-Feng Li, Guang-Can Guo

Abstract: We investigate the violation of Leggett-Garg (LG) inequalities in quantum dots with the stationarity assumption. By comparing two types of LG inequalities, we identify the one that is easier to test experimentally. In addition, we show that the fine-structure splitting, background noise and temperature of quantum dots all influence the violation of LG inequalities.

Submitted 25 October, 2011; v1 submitted 25 October, 2011; originally announced October 2011.

Comments: 4 pages, 5 figures

Journal ref: Chin. Phys. Lett. 29, 120302 (2012)

arXiv:1011.5114 [pdf, ps, other]

doi 10.1088/0256-307X/28/12/120302

Non-Markovian Dynamics of Quantum and Classical Correlations in the Presence of System-Bath Coherence

Authors: Chuan-Feng Li, Hao-Tian Wang, Hong-Yuan Yuan, Rong-Chun Ge, Guang-Can Guo

Abstract: We present a detailed study of the dynamics of correlations in non-Markovian environments, applying the hierarchy equations approach. This theoretical treatment is able to take the system-bath interaction into consideration carefully. It is shown that crosses and sudden changes of classical and quantum correlations can happen if we gradually reduce the strength of the interactions between qubits. For some special initial states, sudden transitions between classical and quantum correlations even occur.

Submitted 23 November, 2010; originally announced November 2010.

Comments: 10 pages, 5 figures

arXiv:1010.3620 [pdf, ps, other]

Spin dynamics in the XY model

Authors: Rong-Chun Ge, Chuan-Feng Li, Guang-Can Guo

Abstract: We study the evolution of entanglement, quantum correlation and classical correlation for the one-dimensional XY model in an external transverse magnetic field. The system is initialized in the fully polarized state along the z axis; after annealing, different sites become entangled. We study the three kinds of correlation for both the nearest and the next-nearest neighbor sites. We find that for a large anisotropy parameter the quantum phase transition can be indicated by the dynamics of classical correlation between the nearest neighbor sites. We find that the dynamics of entanglement for both the nearest and next-nearest neighbor sites show significantly different behaviors with different values of magnetic field. We also find that the evolution of quantum correlation and classical correlation of the nearest neighbor sites is clearly different from that of the next-nearest neighbor sites.

Submitted 21 October, 2010; v1 submitted 18 October, 2010; originally announced October 2010.

Comments: 9 pages, new references added, some mistakes corrected

arXiv:1010.3402 [pdf, ps, other]

doi 10.1016/j.physa.2011.04.029

Non-Markovian Entanglement Sudden Death and Rebirth of a Two-Qubit System in the Presence of System-Bath Coherence

Authors: Hao-Tian Wang, Chuan-Feng Li, Yang Zou, Rong-Chun Ge, Guang-Can Guo

Abstract: We present a detailed study of the entanglement dynamics of a two-qubit system coupled to independent non-Markovian environments, employing hierarchy equations. This recently developed theoretical treatment can conveniently solve non-Markovian problems and take into consideration the correlation between the system and bath in an initial state. We concentrate on calculating the death and rebirth time points of the entanglement to obtain a general view of the concurrence curve and explore the behavior of entanglement dynamics with respect to the coupling strength, the characteristic frequency of the noise bath and the environment temperature.

Submitted 17 October, 2010; originally announced October 2010.

Comments: Submitted to Europhysics Letters (Oct. 5, 2010)

arXiv:1010.3375 [pdf, ps, other]

doi 10.1103/PhysRevA.84.054302

Non-classical correlation of cascaded photon pairs emitted from quantum dot

Authors: Yang Zou, Chuan-Feng Li, Jin-Shi Xu, Rong-Chun Ge, Guang-Can Guo

Abstract: We studied the quantum correlation between the photon pairs generated by biexciton cascade decays of self-assembled quantum dots, and determined the temperature behavior associated with the so-called sudden change of the quantum correlation. The relationship between the fine structure splitting and the sudden change temperature is also provided. Our study indicates that the sudden change temperature of this correlation is independent of the background noise in the system and far lower than the entanglement sudden death temperature; therefore it should be easier to observe the phenomenon of correlation sudden change in experiments than to observe entanglement sudden death.

Submitted 13 November, 2011; v1 submitted 16 October, 2010; originally announced October 2010.

Journal ref: Phys. Rev. A 84, 054302 (2011)

arXiv:1007.0669 [pdf, ps, other]

doi 10.1103/PhysRevA.81.064103

Quantum correlation and classical correlation dynamics in the spin-boson model

Authors: Rong-Chun Ge, Ming Gong, Chuan-Feng Li, Jin-Shi Xu, Guang-Can Guo

Abstract: We study the quantum correlation and classical correlation dynamics in a spin-boson model. For two different forms of spectral density, we obtain analytical results and show that the evolutions of both correlations depend closely on the form of the initial state. At the end of evolution, all correlations initially stored in the spin system transfer to reservoirs. It is found that for a large family of initial states, quantum correlation remains equal to the classical correlation during the course of evolution. In addition, there is no increase in the correlations during the course of evolution.

Submitted 5 July, 2010; originally announced July 2010.

Comments: 10 pages, 5 figures

Journal ref: Phys. Rev. A 81, 064103 (2010)

Showing 151–175 of 175 results for author: Ge, R