-
F-802.11P: A Fuzzy Enhancement for IEEE 802.11P in Vehicle-to-Everything Communications
Authors:
Hamdy A. M. Sayedahmed,
Emadeldin M. Elgamal,
Hesham A. Hefny
Abstract:
Vehicle-to-Everything communications (V2X) are becoming increasingly popular as a solution for safer roads and better traffic management. One of the essential protocols in V2X is the Dedicated Short Range Communication (DSRC) protocol suite. DSRC includes the IEEE 802.11p protocol that operates at the medium access control (MAC) and physical (PHY) layers. Upon collision, the IEEE 802.11p MAC layer applies a carrier sense multiple access/collision avoidance (CSMA/CA) mechanism that randomly selects a backoff time to re-check the channel activity and then retransmit. However, the random selection of the backoff time may lead to further packet collisions that decrease the utilization of the communication channel, which suffers from a limited bandwidth in the first place. This paper proposes a fuzzy model based on rational decision-making, which we call F-802.11p, to improve the IEEE 802.11p protocol backoff time selection by limiting the IEEE 802.11p beacon messages to make better use of the available bandwidth. A simulation study presents the evaluation of our work compared to IEEE 802.11p. We deployed the simulation software in two scenarios: the Veins Framework map and the map of New Administrative Cairo in Egypt. We base our comparison on slots backoff, times into backoff, PHY busy time, MAC busy time, total lost packets, and generated/received beacon messages. Simulation results show that both protocols have comparable results in slots backoff, times into backoff, and the generated beacon messages. At the same time, our F-802.11p significantly outperforms the IEEE 802.11p in PHY busy time, MAC busy time, total lost packets, and the received beacon messages in both scenarios.
Submitted 9 August, 2022;
originally announced August 2022.
-
A Proposed Fuzzy Logic Approach for Conserving the Energy of Data Transmission in the Temperature Monitoring Systems of Internet of Things
Authors:
Noha Elqeblawy,
Ammar Mohammed,
Hesham A. Hefny
Abstract:
One of the primary challenges facing the Internet of Things is the reservation and efficient consumption of energy resources, especially in those types of applications that require continuous monitoring or suffer from a lack of ongoing energy resources. Despite this, indoor temperature and humidity monitoring systems are unconcerned about the amount of energy consumed during critical times when sending unimportant or useless data to the control-room servers. This paper proposes a fuzzy logic-based approach for reducing the amount of energy spent in indoor temperature and humidity monitoring systems by filtering the data sent to servers based on several surrounding circumstances, such as the time of data recording and the current energy consumption, while maintaining constant monitoring. The experimental results on the Appliances Energy Prediction dataset show that the proposed fuzzy-based approach successfully reduces energy consumption in temperature and humidity monitoring systems by 11.8%.
Submitted 12 April, 2022;
originally announced April 2022.
-
MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction
Authors:
Balakrishnan Varadarajan,
Ahmed Hefny,
Avikalp Srivastava,
Khaled S. Refaat,
Nigamaa Nayakanti,
Andre Cornman,
Kan Chen,
Bertrand Douillard,
Chi Pang Lam,
Dragomir Anguelov,
Benjamin Sapp
Abstract:
Predicting the future behavior of road users is one of the most challenging and important problems in autonomous driving. Applying deep learning to this problem requires fusing heterogeneous world state in the form of rich perception signals and map information, and inferring highly multi-modal distributions over possible futures. In this paper, we present MultiPath++, a future prediction model that achieves state-of-the-art performance on popular benchmarks. MultiPath++ improves the MultiPath architecture by revisiting many design choices. The first key design difference is a departure from dense image-based encoding of the input world state in favor of a sparse encoding of heterogeneous scene elements: MultiPath++ consumes compact and efficient polylines to describe road features, and raw agent state information directly (e.g., position, velocity, acceleration). We propose a context-aware fusion of these elements and develop a reusable multi-context gating fusion component. Second, we reconsider the choice of pre-defined, static anchors, and develop a way to learn latent anchor embeddings end-to-end in the model. Lastly, we explore ensembling and output aggregation techniques -- common in other ML domains -- and find effective variants for our probabilistic multimodal output representation. We perform an extensive ablation on these design choices, and show that our proposed model achieves state-of-the-art performance on the Argoverse Motion Forecasting Competition and the Waymo Open Dataset Motion Prediction Challenge.
Submitted 21 December, 2021; v1 submitted 29 November, 2021;
originally announced November 2021.
-
An Intelligent and Low-cost Eye-tracking System for Motorized Wheelchair Control
Authors:
Mahmoud Dahmani,
Muhammad E. H. Chowdhury,
Amith Khandakar,
Tawsifur Rahman,
Khaled Al-Jayyousi,
Abdalla Hefny,
Serkan Kiranyaz
Abstract:
In the 34 developed and 156 developing countries, there are about 132 million disabled people who need a wheelchair, constituting 1.86% of the world population. Moreover, there are millions of people suffering from diseases related to motor disabilities, which cause an inability to produce controlled movement in any of the limbs or even the head. The paper proposes a system to aid people with motor disabilities by restoring their ability to move effectively and effortlessly, without having to rely on others, using an eye-controlled electric wheelchair. The system input was images of the user's eye, which were processed to estimate the gaze direction, and the wheelchair was moved accordingly. To accomplish such a feat, four user-specific methods were developed, implemented and tested, all of which were based on a benchmark database created by the authors. The first three techniques were automatic, employed correlation, and were variants of template matching, while the last one used convolutional neural networks (CNNs). Different metrics to quantitatively evaluate the performance of each algorithm in terms of accuracy and latency were computed, and an overall comparison is presented. The CNN exhibited the best performance (i.e. 99.3% classification accuracy), and thus it was the model of choice for the gaze estimator, which commands the wheelchair motion. The system was evaluated carefully on 8 subjects, achieving 99% accuracy under changing illumination conditions, both outdoors and indoors. This required modifying a motorized wheelchair to adapt it to the predictions output by the gaze estimation algorithm. The wheelchair control can bypass any decision made by the gaze estimator and immediately halt its motion with the help of an array of proximity sensors, if the measured distance goes below a well-defined safety margin.
Submitted 2 May, 2020;
originally announced May 2020.
-
Recurrent Predictive State Policy Networks
Authors:
Ahmed Hefny,
Zita Marinho,
Wen Sun,
Siddhartha Srinivasa,
Geoffrey Gordon
Abstract:
We introduce Recurrent Predictive State Policy (RPSP) networks, a recurrent architecture that brings insights from predictive state representations to reinforcement learning in partially observable environments. Predictive state policy networks consist of a recursive filter, which keeps track of a belief about the state of the environment, and a reactive policy that directly maps beliefs to actions, to maximize the cumulative reward. The recursive filter leverages predictive state representations (PSRs) (Rosencrantz and Gordon, 2004; Sun et al., 2016) by modeling predictive state-- a prediction of the distribution of future observations conditioned on history and future actions. This representation gives rise to a rich class of statistically consistent algorithms (Hefny et al., 2018) to initialize the recursive filter. Predictive state serves as an equivalent representation of a belief state. Therefore, the policy component of the RPSP-network can be purely reactive, simplifying training while still allowing optimal behaviour. Moreover, we use the PSR interpretation during training as well, by incorporating prediction error in the loss function. The entire network (recursive filter and reactive policy) is still differentiable and can be trained using gradient based methods. We optimize our policy using a combination of policy gradient based on rewards (Williams, 1992) and gradient descent based on prediction error. We show the efficacy of RPSP-networks under partial observability on a set of robotic control tasks from OpenAI Gym. We empirically show that RPSP-networks perform well compared with memory-preserving networks such as GRUs, as well as finite memory models, being the overall best performing method.
Submitted 4 March, 2018;
originally announced March 2018.
-
Swarm Intelligence in Semi-supervised Classification
Authors:
Shahira Shaaban Azab,
Hesham Ahmed Hefny
Abstract:
This paper presents a literature review of swarm intelligence (SI) algorithms in the area of semi-supervised classification. Many research papers apply swarm intelligence algorithms in the area of machine learning (ML). Some SI algorithms are applied in ML either on their own or hybridized with other ML algorithms. SI algorithms are also used for tuning the parameters of ML algorithms, or as a backbone for ML algorithms. This paper gives a brief literature review of applying swarm intelligence algorithms in the field of semi-supervised learning.
Submitted 3 June, 2017;
originally announced June 2017.
-
Center of Gravity PSO for Partitioning Clustering
Authors:
Shahira Shaaban Azab,
Hesham Ahmed Hefny
Abstract:
This paper presents the local best model of PSO for partition-based clustering. The proposed model gets rid of the drawbacks of gbest PSO for clustering. The model uses a pre-specified number of clusters K. The LPOSC has K neighborhoods. Each neighborhood represents one of the clusters. The goal of the particles in each neighborhood is to optimize the position of the centroid of the cluster. The performance of the proposed algorithm is measured using the adjusted Rand index. The results are compared with k-means and the global best model of PSO.
Submitted 14 June, 2017; v1 submitted 3 June, 2017;
originally announced June 2017.
-
Semi-supervised Classification: Cluster and Label Approach using Particle Swarm Optimization
Authors:
Shahira Shaaban Azab,
Mohamed Farouk Abdel Hady,
Hesham Ahmed Hefny
Abstract:
Classification predicts classes of objects using the knowledge learned during the training phase. This process requires learning from labeled samples. However, labeled samples are usually limited. The annotation process is annoying, tedious, expensive, and requires human experts. Meanwhile, unlabeled data is available and almost free. Semi-supervised learning approaches make use of both labeled and unlabeled data. This paper introduces a cluster-and-label approach using PSO for semi-supervised classification. PSO is competitive with traditional clustering algorithms. A new local best PSO is presented to cluster the unlabeled data. The available labeled data guides the learning process. The experiments are conducted using four state-of-the-art datasets from different domains. The results are compared with Label Propagation, a popular semi-supervised classifier, and two state-of-the-art supervised classification models, namely k-nearest neighbors and decision trees. The experiments show the efficiency of the proposed model.
Submitted 3 June, 2017;
originally announced June 2017.
-
Predictive State Recurrent Neural Networks
Authors:
Carlton Downey,
Ahmed Hefny,
Boyue Li,
Byron Boots,
Geoffrey Gordon
Abstract:
We present a new model, Predictive State Recurrent Neural Networks (PSRNNs), for filtering and prediction in dynamical systems. PSRNNs draw on insights from both Recurrent Neural Networks (RNNs) and Predictive State Representations (PSRs), and inherit advantages from both types of models. Like many successful RNN architectures, PSRNNs use (potentially deeply composed) bilinear transfer functions to combine information from multiple sources. We show that such bilinear functions arise naturally from state updates in Bayes filters like PSRs, in which observations can be viewed as gating belief states. We also show that PSRNNs can be learned effectively by combining Backpropagation Through Time (BPTT) with an initialization derived from a statistically consistent learning algorithm for PSRs called two-stage regression (2SR). Finally, we show that PSRNNs can be factorized using tensor decomposition, reducing model size and suggesting interesting connections to existing multiplicative architectures such as LSTMs. We applied PSRNNs to 4 datasets, and showed that we outperform several popular alternative approaches to modeling dynamical systems in all cases.
Submitted 17 June, 2017; v1 submitted 25 May, 2017;
originally announced May 2017.
-
Practical Learning of Predictive State Representations
Authors:
Carlton Downey,
Ahmed Hefny,
Geoffrey Gordon
Abstract:
Over the past decade there has been considerable interest in spectral algorithms for learning Predictive State Representations (PSRs). Spectral algorithms have appealing theoretical guarantees; however, the resulting models do not always perform well on inference tasks in practice. One reason for this behavior is the mismatch between the intended task (accurate filtering or prediction) and the loss function being optimized by the algorithm (estimation error in model parameters).
A natural idea is to improve performance by refining PSRs using an algorithm such as EM. Unfortunately it is not obvious how to apply an EM-style algorithm in the context of PSRs, as the log likelihood is not well defined for all PSRs. We show that it is possible to overcome this problem using ideas from Predictive State Inference Machines.
We combine spectral algorithms for PSRs as a consistent and efficient initialization with PSIM-style updates to refine the resulting model parameters. By combining these two ideas we develop Inference Gradients, a simple, fast, and robust method for practical learning of PSRs. Inference Gradients performs gradient descent in the PSR parameter space to optimize an inference-based loss function like PSIM. Because Inference Gradients uses a spectral initialization we get the same consistency benefits as PSRs. We show that Inference Gradients outperforms both PSRs and PSIMs on real and synthetic data sets.
Submitted 14 February, 2017;
originally announced February 2017.
-
An Efficient, Expressive and Local Minima-free Method for Learning Controlled Dynamical Systems
Authors:
Ahmed Hefny,
Carlton Downey,
Geoffrey J. Gordon
Abstract:
We propose a framework for modeling and estimating the state of controlled dynamical systems, where an agent can affect the system through actions and receives partial observations. Based on this framework, we propose the Predictive State Representation with Random Fourier Features (RFF-PSR). A key property of RFF-PSRs is that the state estimate is represented by a conditional distribution of future observations given future actions. RFF-PSRs combine this representation with moment-matching, kernel embedding and local optimization to achieve a method that enjoys several favorable qualities: it can represent controlled environments which can be affected by actions; it has an efficient and theoretically justified learning algorithm; it uses a non-parametric representation that has expressive power to represent continuous non-linear dynamics. We provide a detailed formulation, a theoretical analysis and an experimental evaluation that demonstrates the effectiveness of our method.
Submitted 28 February, 2018; v1 submitted 12 February, 2017;
originally announced February 2017.
-
Stochastic Variance Reduction for Nonconvex Optimization
Authors:
Sashank J. Reddi,
Ahmed Hefny,
Suvrit Sra,
Barnabas Poczos,
Alex Smola
Abstract:
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to mini-batching in parallel settings.
Submitted 4 April, 2016; v1 submitted 19 March, 2016;
originally announced March 2016.
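As a rough illustration of the SVRG update mentioned in the abstract above, the following is a minimal sketch on a small least-squares finite sum. It is not the authors' code: the convex toy objective, the function names (`grad_i`, `full_grad`, `svrg`), the step size, the epoch length, and the choice of the last inner iterate as the next snapshot are all assumptions made for illustration, and the sketch does not exercise the nonconvex setting the paper analyzes.

```python
import random


def grad_i(w, x, y):
    """Gradient of the single-sample loss 0.5 * (x.w - y)^2 with respect to w."""
    r = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [r * xj for xj in x]


def full_grad(w, X, Y):
    """Average of the per-sample gradients over the whole data set."""
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        g = [a + b / n for a, b in zip(g, grad_i(w, x, y))]
    return g


def svrg(X, Y, step=0.05, epochs=50, inner=None, seed=0):
    """SVRG sketch: each epoch stores a snapshot and its full gradient,
    then takes `inner` variance-reduced stochastic steps."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    inner = inner or 2 * n           # assumed default epoch length
    w_snap = [0.0] * d
    for _ in range(epochs):
        mu = full_grad(w_snap, X, Y)  # full gradient at the snapshot
        w = list(w_snap)
        for _ in range(inner):
            i = rng.randrange(n)
            g_w = grad_i(w, X[i], Y[i])
            g_s = grad_i(w_snap, X[i], Y[i])
            # Variance-reduced direction: g_i(w) - g_i(w_snap) + mu
            w = [wj - step * (a - b + m)
                 for wj, a, b, m in zip(w, g_w, g_s, mu)]
        w_snap = w                    # assumption: last iterate becomes the snapshot
    return w_snap
```

Calling `svrg(X, Y, step=0.05, epochs=50, inner=100)` on a small consistent linear system should drive the iterate toward the least-squares solution; the correction term `g_s - mu` is what distinguishes the update from plain SGD.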
-
Rows vs. Columns: Randomized Kaczmarz or Gauss-Seidel for Ridge Regression
Authors:
Ahmed Hefny,
Deanna Needell,
Aaditya Ramdas
Abstract:
The Kaczmarz and Gauss-Seidel methods aim to solve a linear $m \times n$ system $\boldsymbol{X} \boldsymbol{\beta} = \boldsymbol{y}$ by iteratively refining the solution estimate; the former uses random rows of $\boldsymbol{X}$ to update $\boldsymbol{\beta}$ given the corresponding equations and the latter uses random columns of $\boldsymbol{X}$ to update corresponding coordinates in $\boldsymbol{\beta}$. Interest in these methods was recently revitalized by a proof of Strohmer and Vershynin showing linear convergence in expectation for a randomized Kaczmarz method variant (RK), and a similar result for the randomized Gauss-Seidel algorithm (RGS) was later proved by Lewis and Leventhal. Recent work unified the analysis of these algorithms for the overcomplete and undercomplete systems, showing convergence to the ordinary least squares (OLS) solution and the minimum Euclidean norm solution respectively. This paper considers the natural follow-up to the OLS problem, ridge regression, which solves $(\boldsymbol{X}^* \boldsymbol{X} + \lambda \boldsymbol{I}) \boldsymbol{\beta} = \boldsymbol{X}^* \boldsymbol{y}$. We present particular variants of RK and RGS for solving this system and derive their convergence rates. We compare these to a recent proposal by Ivanov and Zhdanov to solve this system, that can be interpreted as randomly sampling both rows and columns, which we argue is often suboptimal. Instead, we claim that one should always use RGS (columns) when $m > n$ and RK (rows) when $m < n$. This difference in behavior is simply related to the minimum eigenvalue of two related positive semidefinite matrices, $\boldsymbol{X}^* \boldsymbol{X} + \lambda \boldsymbol{I}_n$ and $\boldsymbol{X} \boldsymbol{X}^* + \lambda \boldsymbol{I}_m$ when $m > n$ or $m < n$.
Submitted 11 May, 2017; v1 submitted 21 July, 2015;
originally announced July 2015.
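To make the column-based (Gauss-Seidel style) idea from the abstract above concrete, here is a minimal sketch of randomized coordinate updates for the ridge system, written for real-valued data. It is a generic exact coordinate minimization of 0.5*||y - X beta||^2 + 0.5*lambda*||beta||^2 and may differ from the paper's exact RGS variant; the function name `ridge_rgs`, the uniform column sampling, and the fixed iteration count are assumptions.

```python
import random


def ridge_rgs(X, y, lam, iters=2000, seed=0):
    """Randomized coordinate (column) updates for ridge regression.

    Each step picks one column j of X and minimizes the ridge objective
    exactly along coordinate beta[j], keeping the residual y - X beta
    up to date. Requires lam > 0 (or nonzero columns).
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    cols = [[X[i][j] for i in range(m)] for j in range(n)]
    col_sq = [sum(v * v for v in c) for c in cols]
    beta = [0.0] * n
    resid = list(y)  # residual y - X beta for beta = 0
    for _ in range(iters):
        j = rng.randrange(n)  # assumption: columns sampled uniformly
        # Negative partial derivative along coordinate j: X_j^T r - lam * beta_j
        g = sum(c * r for c, r in zip(cols[j], resid)) - lam * beta[j]
        delta = g / (col_sq[j] + lam)
        beta[j] += delta
        resid = [r - c * delta for r, c in zip(resid, cols[j])]
    return beta
```

Each step costs one pass over a single column of length $m$ and touches one of the $n$ coordinates, which gives some intuition for why a column method is the natural candidate when $m > n$; the precise rates are the subject of the paper and are not reproduced here.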
-
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
Authors:
Sashank J. Reddi,
Ahmed Hefny,
Suvrit Sra,
Barnabás Póczos,
Alex Smola
Abstract:
We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through development of algorithms like SAG, SVRG, SAGA. These algorithms have been shown to outperform SGD, both theoretically and empirically. However, asynchronous versions of these algorithms---a crucial requirement for modern large-scale applications---have not been studied. We bridge this gap by presenting a unifying framework for many variance reduction techniques. Subsequently, we propose an asynchronous algorithm grounded in our framework, and prove its fast convergence. An important consequence of our general approach is that it yields asynchronous versions of variance reduction algorithms such as SVRG and SAGA as a byproduct. Our method achieves near linear speedup in sparse settings common to machine learning. We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.
Submitted 24 January, 2016; v1 submitted 22 June, 2015;
originally announced June 2015.
-
Supervised Learning for Dynamical System Learning
Authors:
Ahmed Hefny,
Carlton Downey,
Geoffrey Gordon
Abstract:
Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be difficult to use and extend in practice: e.g., they can make it difficult to incorporate prior information such as sparsity or structure. To address this problem, we present a new view of dynamical system learning: we show how to learn dynamical systems by solving a sequence of ordinary supervised learning problems, thereby allowing users to incorporate prior knowledge via standard techniques such as L1 regularization. Many existing spectral methods are special cases of this new framework, using linear regression as the supervised learner. We demonstrate the effectiveness of our framework by showing examples where nonlinear regression or lasso let us learn better state representations than plain linear regression does; the correctness of these instances follows directly from our general analysis.
Submitted 4 November, 2015; v1 submitted 20 May, 2015;
originally announced May 2015.
-
Large-scale randomized-coordinate descent methods with non-separable linear constraints
Authors:
Sashank Reddi,
Ahmed Hefny,
Carlton Downey,
Avinava Dubey,
Suvrit Sra
Abstract:
We develop randomized (block) coordinate descent (CD) methods for linearly constrained convex optimization. Unlike most CD methods, we do not assume the constraints to be separable, but let them be coupled linearly. To our knowledge, ours is the first CD method that allows linear coupling constraints, without making the global iteration complexity have an exponential dependence on the number of constraints. We present algorithms and analysis for four key problem scenarios: (i) smooth; (ii) smooth + nonsmooth separable; (iii) asynchronous parallel; and (iv) stochastic. We illustrate empirical behavior of our algorithms by simulation experiments.
Submitted 10 June, 2015; v1 submitted 9 September, 2014;
originally announced September 2014.
-
A non-parametric mixture model for topic modeling over time
Authors:
Avinava Dubey,
Ahmed Hefny,
Sinead Williamson,
Eric P. Xing
Abstract:
A single, stationary topic model such as latent Dirichlet allocation is inappropriate for modeling corpora that span long time periods, as the popularity of topics is likely to change over time. A number of models that incorporate time have been proposed, but in general they either exhibit limited forms of temporal variation, or require computationally expensive inference methods. In this paper we propose non-parametric Topics over Time (npTOT), a model for time-varying topics that allows an unbounded number of topics and flexible distribution over the temporal variations in those topics' popularity. We develop a collapsed Gibbs sampler for the proposed model and compare against existing models on synthetic and real document sets.
Submitted 21 August, 2012;
originally announced August 2012.