-
Continual Reinforcement Learning in 3D Non-stationary Environments
Authors:
Vincenzo Lomonaco,
Karan Desai,
Eugenio Culurciello,
Davide Maltoni
Abstract:
High-dimensional, always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained off-line in very static and controlled conditions in simulation such that training observations can be thought of as sampled i.i.d. from the entire observation space. However, in real-world settings, the environment is often non-stationary and subject to unpredictable, frequent changes. In this paper we propose and openly release CRLMaze, a new benchmark for learning continually through reinforcement in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. Then, we introduce an end-to-end model-free continual reinforcement learning strategy showing competitive results with respect to four different baselines and not requiring any access to additional supervised signals, previously encountered environmental conditions or observations.
Submitted 21 April, 2020; v1 submitted 24 May, 2019;
originally announced May 2019.
-
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
Authors:
Ramakrishna Vedantam,
Karan Desai,
Stefan Lee,
Marcus Rohrbach,
Dhruv Batra,
Devi Parikh
Abstract:
We propose a new class of probabilistic neural-symbolic models that have symbolic functional programs as a latent, stochastic variable. Instantiated in the context of visual question answering, our probabilistic formulation offers two key conceptual advantages over prior neural-symbolic models for VQA. Firstly, the programs generated by our model are more understandable while requiring fewer teaching examples. Secondly, we show that one can pose counterfactual scenarios to the model, to probe its beliefs on the programs that could lead to a specified answer given an image. Our results on the CLEVR and SHAPES datasets verify our hypotheses, showing that the model gets better program (and answer) prediction accuracy even in the low data regime, and allows one to probe the coherence and consistency of the reasoning performed.
Submitted 27 June, 2019; v1 submitted 20 February, 2019;
originally announced February 2019.
-
Insights from the Wikipedia Contest (IEEE Contest for Data Mining 2011)
Authors:
Kalpit V Desai,
Roopesh Ranjan
Abstract:
The Wikimedia Foundation has recently observed that newly joining editors on Wikipedia are increasingly failing to integrate into the Wikipedia editors' community, i.e. the community is becoming increasingly hard to penetrate. To sustain healthy growth of the community, the Wikimedia Foundation aims to quantitatively understand the factors that determine editing behavior, and to explain why most new editors become inactive soon after joining. As a step towards this broader goal, the Wikimedia Foundation sponsored the ICDM (IEEE International Conference for Data Mining) contest for the year 2011.
The objective for the participants was to develop models to predict the number of edits that an editor will make in the following five months, based on the editor's editing history. Here we describe the approach we followed for developing predictive models towards this goal, the results that we obtained and the modeling insights that we gained from this exercise. In addition, towards the broader goal of the Wikimedia Foundation, we also summarize the factors that emerged during our model building exercise as powerful predictors of future editing activity.
Submitted 7 January, 2014;
originally announced May 2014.
-
Tellipsoid: Exploiting inter-gene correlation for improved detection of differential gene expression
Authors:
Keyur Desai,
J. R. Deller, Jr.,
J. Justin McCormick
Abstract:
Motivation: Algorithms for differential analysis of microarray data are vital to modern biomedical research. Their accuracy strongly depends on effective treatment of inter-gene correlation. Correlation is ordinarily accounted for in terms of its effect on significance cut-offs. In this paper it is shown that correlation can, in fact, be exploited to share information across tests, which, in turn, can increase statistical power.
Results: Vastly and demonstrably improved differential analysis approaches are the result of combining identifiability (the fact that in most microarray data sets, a large proportion of genes can be identified a priori as non-differential) with optimization criteria that incorporate correlation. As a special case, we develop a method which builds upon the widely used two-sample t-statistic based approach and uses the Mahalanobis distance as an optimality criterion. Results on the prostate cancer data of Singh et al. (2002) suggest that the proposed method outperforms all published approaches in terms of statistical power.
Availability: The proposed algorithm is implemented in MATLAB and in R. The software, called Tellipsoid, and relevant data sets are available at http://www.egr.msu.edu/~desaikey
Submitted 20 February, 2008;
originally announced February 2008.