Learning with Value-Ramp

Authors: Tom J. Ameloot, Jan Van den Bussche

Abstract: We study a learning principle based on the intuition of forming ramps. The agent tries to follow an increasing sequence of values until the agent meets a peak of reward. The resulting Value-Ramp algorithm is natural, easy to configure, and has a robust implementation with natural numbers.

Submitted 23 April, 2017; v1 submitted 12 August, 2016; originally announced August 2016.

Comments: Version 2: fixed notation in definition of transition + clarified a sentence in the Introduction

arXiv:1605.04691

On Avoidance Learning with Partial Observability

Authors: Tom J. Ameloot

Abstract: We study a framework where agents have to avoid aversive signals. The agents are given only partial information, in the form of features that are projections of task states. Additionally, the agents have to cope with non-determinism, defined as unpredictability on the way that actions are executed. The goal of each agent is to define its behavior based on feature-action pairs that reliably avoid aversive signals. We study a learning algorithm, called A-learning, that exhibits fixpoint convergence, where the belief of the allowed feature-action pairs eventually becomes fixed. A-learning is parameter-free and easy to implement.

Submitted 16 May, 2016; originally announced May 2016.

arXiv:1511.08724

On the convergence of cycle detection for navigational reinforcement learning

Authors: Tom J. Ameloot, Jan Van den Bussche

Abstract: We consider a reinforcement learning framework where agents have to navigate from start states to goal states. We prove convergence of a cycle-detection learning algorithm on a class of tasks that we call reducible. Reducible tasks have an acyclic solution. We also syntactically characterize the form of the final policy. This characterization can be used to precisely detect the convergence point in a simulation. Our result demonstrates that even simple algorithms can be successful in learning a large class of nontrivial tasks. In addition, our framework is elementary in the sense that we only use basic concepts to formally prove convergence.

Submitted 5 January, 2016; v1 submitted 27 November, 2015; originally announced November 2015.

arXiv:1507.05539

doi 10.1017/S1471068415000381

Putting Logic-Based Distributed Systems on Stable Grounds

Authors: Tom J. Ameloot, Jan Van den Bussche, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein

Abstract: In the Declarative Networking paradigm, Datalog-like languages are used to express distributed computations. Whereas recently formal operational semantics for these languages have been developed, a corresponding declarative semantics has been lacking so far. The challenge is to capture precisely the amount of nondeterminism that is inherent to distributed computations due to concurrency, networking delays, and asynchronous communication. This paper shows how a declarative, model-based semantics can be obtained by simply using the well-known stable model semantics for Datalog with negation. We show that the model-based semantics matches previously proposed formal operational semantics.

Submitted 25 July, 2015; v1 submitted 20 July, 2015; originally announced July 2015.

Comments: To appear in Theory and Practice of Logic Programming (TPLP)

Journal ref: Theory and Practice of Logic Programming 16 (2016) 378-417

arXiv:1502.06094

doi 10.1162/NECO_a_00789

Positive Neural Networks in Discrete Time Implement Monotone-Regular Behaviors

Authors: Tom J. Ameloot, Jan Van den Bussche

Abstract: We study the expressive power of positive neural networks. The model uses positive connection weights and multiple input neurons. Different behaviors can be expressed by varying the connection weights. We show that in discrete time, and in absence of noise, the class of positive neural networks captures the so-called monotone-regular behaviors, that are based on regular languages. A finer picture emerges if one takes into account the delay by which a monotone-regular behavior is implemented. Each monotone-regular behavior can be implemented by a positive neural network with a delay of one time unit. Some monotone-regular behaviors can be implemented with zero delay. And, interestingly, some simple monotone-regular behaviors can not be implemented with zero delay.

Submitted 1 December, 2015; v1 submitted 21 February, 2015; originally announced February 2015.

Journal ref: Neural Computation, December 2015, Vol. 27, No. 12, Pages 2623-2660

arXiv:1412.4030

Parallel-Correctness and Transferability for Conjunctive Queries

Authors: Tom J. Ameloot, Gaetano Geck, Bas Ketsman, Frank Neven, Thomas Schwentick

Abstract: A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data is first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as a distribution policy. We introduce a correctness condition, called parallel-correctness, for the evaluation of queries w.r.t. a distribution policy. We study the complexity of parallel-correctness for conjunctive queries as well as transferability of parallel-correctness between queries. We also investigate the complexity of transferability for certain families of distribution policies, including, for instance, the Hypercube distribution.

Submitted 5 January, 2015; v1 submitted 12 December, 2014; originally announced December 2014.

Comments: 30 pages

Showing 1–6 of 6 results for author: Ameloot, T J