-
CUER: Corrected Uniform Experience Replay for Off-Policy Continuous Deep Reinforcement Learning Algorithms
Authors:
Arda Sarp Yenicesu,
Furkan B. Mutlu,
Suleyman S. Kozat,
Ozgur S. Oguz
Abstract:
The utilization of the experience replay mechanism enables agents to effectively leverage their experiences on several occasions. In previous studies, the sampling probability of the transitions was modified based on their relative significance. The process of reassigning sample probabilities for every transition in the replay buffer after each iteration is considered extremely inefficient. Hence,…
▽ More
The utilization of the experience replay mechanism enables agents to effectively leverage their experiences on several occasions. In previous studies, the sampling probability of the transitions was modified based on their relative significance. The process of reassigning sample probabilities for every transition in the replay buffer after each iteration is considered extremely inefficient. Hence, in order to enhance computing efficiency, experience replay prioritization algorithms reassess the importance of a transition as it is sampled. However, the relative importance of the transitions undergoes dynamic adjustments when the agent's policy and value function are iteratively updated. Furthermore, experience replay is a mechanism that retains the transitions generated by the agent's past policies, which could potentially diverge significantly from the agent's most recent policy. An increased deviation from the agent's most recent policy results in a greater frequency of off-policy updates, which has a negative impact on the agent's performance. In this paper, we develop a novel algorithm, Corrected Uniform Experience Replay (CUER), which stochastically samples the stored experience while considering the fairness among all other experiences without ignoring the dynamic nature of the transition importance by making sampled state distribution more on-policy. CUER provides promising improvements for off-policy continuous control algorithms in terms of sample efficiency, final performance, and stability of the policy during the training.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Binary Feature Mask Optimization for Feature Selection
Authors:
Mehmet E. Lorasdagi,
Mehmet Y. Turali,
Ali T. Koc,
Suleyman S. Kozat
Abstract:
We investigate feature selection problem for generic machine learning (ML) models. We introduce a novel framework that selects features considering the predictions of the model. Our framework innovates by using a novel feature masking approach to eliminate the features during the selection process, instead of completely removing them from the dataset. This allows us to use the same ML model during…
▽ More
We investigate feature selection problem for generic machine learning (ML) models. We introduce a novel framework that selects features considering the predictions of the model. Our framework innovates by using a novel feature masking approach to eliminate the features during the selection process, instead of completely removing them from the dataset. This allows us to use the same ML model during feature selection, unlike other feature selection methods where we need to train the ML model again as the dataset has different dimensions on each iteration. We obtain the mask operator using the predictions of the ML model, which offers a comprehensive view on the subsets of the features essential for the predictive performance of the model. A variety of approaches exist in the feature selection literature. However, no study has introduced a training-free framework for a generic ML model to select features while considering the importance of the feature subsets as a whole, instead of focusing on the individual features. We demonstrate significant performance improvements on the real-life datasets under different settings using LightGBM and Multi-Layer Perceptron as our ML models. Additionally, we openly share the implementation code for our methods to encourage the research and the contributions in this area.
△ Less
Submitted 23 January, 2024;
originally announced January 2024.
-
AFS-BM: Enhancing Model Performance through Adaptive Feature Selection with Binary Masking
Authors:
Mehmet Y. Turali,
Mehmet E. Lorasdagi,
Ali T. Koc,
Suleyman S. Kozat
Abstract:
We study the problem of feature selection in general machine learning (ML) context, which is one of the most critical subjects in the field. Although, there exist many feature selection methods, however, these methods face challenges such as scalability, managing high-dimensional data, dealing with correlated features, adapting to variable feature importance, and integrating domain knowledge. To t…
▽ More
We study the problem of feature selection in general machine learning (ML) context, which is one of the most critical subjects in the field. Although, there exist many feature selection methods, however, these methods face challenges such as scalability, managing high-dimensional data, dealing with correlated features, adapting to variable feature importance, and integrating domain knowledge. To this end, we introduce the "Adaptive Feature Selection with Binary Masking" (AFS-BM) which remedies these problems. AFS-BM achieves this by joint optimization for simultaneous feature selection and model training. In particular, we do the joint optimization and binary masking to continuously adapt the set of features and model parameters during the training process. This approach leads to significant improvements in model accuracy and a reduction in computational requirements. We provide an extensive set of experiments where we compare AFS-BM with the established feature selection methods using well-known datasets from real-life competitions. Our results show that AFS-BM makes significant improvement in terms of accuracy and requires significantly less computational complexity. This is due to AFS-BM's ability to dynamically adjust to the changing importance of features during the training process, which an important contribution to the field. We openly share our code for the replicability of our results and to facilitate further research.
△ Less
Submitted 17 June, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
Hierarchical Ensemble-Based Feature Selection for Time Series Forecasting
Authors:
Aysin Tumay,
Mustafa E. Aydin,
Ali T. Koc,
Suleyman S. Kozat
Abstract:
We introduce a novel ensemble approach for feature selection based on hierarchical stacking for non-stationarity and/or a limited number of samples with a large number of features. Our approach exploits the co-dependency between features using a hierarchical structure. Initially, a machine learning model is trained using a subset of features, and then the output of the model is updated using other…
▽ More
We introduce a novel ensemble approach for feature selection based on hierarchical stacking for non-stationarity and/or a limited number of samples with a large number of features. Our approach exploits the co-dependency between features using a hierarchical structure. Initially, a machine learning model is trained using a subset of features, and then the output of the model is updated using other algorithms in a hierarchical manner with the remaining features to minimize the target loss. This hierarchical structure allows for flexible depth and feature selection. By exploiting feature co-dependency hierarchically, our proposed approach overcomes the limitations of traditional feature selection methods and feature importance scores. The effectiveness of the approach is demonstrated on synthetic and well-known real-life datasets, providing significant scalable and stable performance improvements compared to the traditional methods and the state-of-the-art approaches. We also provide the source code of our approach to facilitate further research and replicability of our results.
△ Less
Submitted 21 January, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Hybrid State Space-based Learning for Sequential Data Prediction with Joint Optimization
Authors:
Mustafa E. Aydın,
Arda Fazla,
Suleyman S. Kozat
Abstract:
We investigate nonlinear prediction/regression in an online setting and introduce a hybrid model that effectively mitigates, via a joint mechanism through a state space formulation, the need for domain-specific feature engineering issues of conventional nonlinear prediction models and achieves an efficient mix of nonlinear and linear components. In particular, we use recursive structures to extrac…
▽ More
We investigate nonlinear prediction/regression in an online setting and introduce a hybrid model that effectively mitigates, via a joint mechanism through a state space formulation, the need for domain-specific feature engineering issues of conventional nonlinear prediction models and achieves an efficient mix of nonlinear and linear components. In particular, we use recursive structures to extract features from raw sequential sequences and a traditional linear time series model to deal with the intricacies of the sequential data, e.g., seasonality, trends. The state-of-the-art ensemble or hybrid models typically train the base models in a disjoint manner, which is not only time consuming but also sub-optimal due to the separation of modeling or independent training. In contrast, as the first time in the literature, we jointly optimize an enhanced recurrent neural network (LSTM) for automatic feature extraction from raw data and an ARMA-family time series model (SARIMAX) for effectively addressing peculiarities associated with time series data. We achieve this by introducing novel state space representations for the base models, which are then combined to provide a full state space representation of the hybrid or the ensemble. Hence, we are able to jointly optimize both models in a single pass via particle filtering, for which we also provide the update equations. The introduced architecture is generic so that one can use other recurrent architectures, e.g., GRUs, traditional time series-specific models, e.g., ETS or other optimization methods, e.g., EKF, UKF. Due to such novel combination and joint optimization, we demonstrate significant improvements in widely publicized real life competition datasets. We also openly share our code for further research and replicability of our results.
△ Less
Submitted 19 September, 2023;
originally announced September 2023.
-
Context-Aware Ensemble Learning for Time Series
Authors:
Arda Fazla,
Mustafa Enes Aydin,
Orhun Tamyigit,
Suleyman Serdar Kozat
Abstract:
We investigate ensemble methods for prediction in an online setting. Unlike all the literature in ensembling, for the first time, we introduce a new approach using a meta learner that effectively combines the base model predictions via using a superset of the features that is the union of the base models' feature vectors instead of the predictions themselves. Here, our model does not use the predi…
▽ More
We investigate ensemble methods for prediction in an online setting. Unlike all the literature in ensembling, for the first time, we introduce a new approach using a meta learner that effectively combines the base model predictions via using a superset of the features that is the union of the base models' feature vectors instead of the predictions themselves. Here, our model does not use the predictions of the base models as inputs to a machine learning algorithm, but choose the best possible combination at each time step based on the state of the problem. We explore three different constraint spaces for the ensembling of the base learners that linearly combines the base predictions, which are convex combinations where the components of the ensembling vector are all nonnegative and sum up to 1; affine combinations where the weight vector components are required to sum up to 1; and the unconstrained combinations where the components are free to take any real value. The constraints are both theoretically analyzed under known statistics and integrated into the learning procedure of the meta learner as a part of the optimization in an automated manner. To show the practical efficiency of the proposed method, we employ a gradient-boosted decision tree and a multi-layer perceptron separately as the meta learners. Our framework is generic so that one can use other machine learning architectures as the ensembler as long as they allow for a custom differentiable loss for minimization. We demonstrate the learning behavior of our algorithm on synthetic data and the significant performance improvements over the conventional methods over various real life datasets, extensively used in the well-known data competitions. Furthermore, we openly share the source code of the proposed method to facilitate further research and comparison.
△ Less
Submitted 30 November, 2022;
originally announced November 2022.
-
Deep Reinforcement Learning Based Joint Downlink Beamforming and RIS Configuration in RIS-aided MU-MISO Systems Under Hardware Impairments and Imperfect CSI
Authors:
Baturay Saglam,
Doga Gurgunoglu,
Suleyman S. Kozat
Abstract:
We introduce a novel deep reinforcement learning (DRL) approach to jointly optimize transmit beamforming and reconfigurable intelligent surface (RIS) phase shifts in a multiuser multiple input single output (MU-MISO) system to maximize the sum downlink rate under the phase-dependent reflection amplitude model. Our approach addresses the challenge of imperfect channel state information (CSI) and ha…
▽ More
We introduce a novel deep reinforcement learning (DRL) approach to jointly optimize transmit beamforming and reconfigurable intelligent surface (RIS) phase shifts in a multiuser multiple input single output (MU-MISO) system to maximize the sum downlink rate under the phase-dependent reflection amplitude model. Our approach addresses the challenge of imperfect channel state information (CSI) and hardware impairments by considering a practical RIS amplitude model. We compare the performance of our approach against a vanilla DRL agent in two scenarios: perfect CSI and phase-dependent RIS amplitudes, and mismatched CSI and ideal RIS reflections. The results demonstrate that the proposed framework significantly outperforms the vanilla DRL agent under mismatch and approaches the golden standard. Our contributions include modifications to the DRL approach to address the joint design of transmit beamforming and phase shifts and the phase-dependent amplitude model. To the best of our knowledge, our method is the first DRL-based approach for the phase-dependent reflection amplitude model in RIS-aided MU-MISO systems. Our findings in this study highlight the potential of our approach as a promising solution to overcome hardware impairments in RIS-aided wireless communication systems.
△ Less
Submitted 29 March, 2023; v1 submitted 10 October, 2022;
originally announced November 2022.
-
Deep Intrinsically Motivated Exploration in Continuous Control
Authors:
Baturay Saglam,
Suleyman S. Kozat
Abstract:
In continuous control, exploration is often performed through undirected strategies in which parameters of the networks or selected actions are perturbed by random noise. Although the deep setting of undirected exploration has been shown to improve the performance of on-policy methods, they introduce an excessive computational complexity and are known to fail in the off-policy setting. The intrins…
▽ More
In continuous control, exploration is often performed through undirected strategies in which parameters of the networks or selected actions are perturbed by random noise. Although the deep setting of undirected exploration has been shown to improve the performance of on-policy methods, they introduce an excessive computational complexity and are known to fail in the off-policy setting. The intrinsically motivated exploration is an effective alternative to the undirected strategies, but they are usually studied for discrete action domains. In this paper, we investigate how intrinsic motivation can effectively be combined with deep reinforcement learning in the control of continuous systems to obtain a directed exploratory behavior. We adapt the existing theories on animal motivational systems into the reinforcement learning paradigm and introduce a novel and scalable directed exploration strategy. The introduced approach, motivated by the maximization of the value function's error, can benefit from a collected set of experiences by extracting useful information and unify the intrinsic exploration motivations in the literature under a single exploration objective. An extensive set of empirical studies demonstrate that our framework extends to larger and more diverse state spaces, dramatically improves the baselines, and outperforms the undirected strategies significantly.
△ Less
Submitted 1 October, 2022;
originally announced October 2022.
-
Actor Prioritized Experience Replay
Authors:
Baturay Saglam,
Furkan B. Mutlu,
Dogan C. Cicek,
Suleyman S. Kozat
Abstract:
A widely-studied deep reinforcement learning (RL) technique known as Prioritized Experience Replay (PER) allows agents to learn from transitions sampled with non-uniform probability proportional to their temporal-difference (TD) error. Although it has been shown that PER is one of the most crucial components for the overall performance of deep RL methods in discrete action domains, many empirical…
▽ More
A widely-studied deep reinforcement learning (RL) technique known as Prioritized Experience Replay (PER) allows agents to learn from transitions sampled with non-uniform probability proportional to their temporal-difference (TD) error. Although it has been shown that PER is one of the most crucial components for the overall performance of deep RL methods in discrete action domains, many empirical studies indicate that it considerably underperforms actor-critic algorithms in continuous control. We theoretically show that actor networks cannot be effectively trained with transitions that have large TD errors. As a result, the approximate policy gradient computed under the Q-network diverges from the actual gradient computed under the optimal Q-function. Motivated by this, we introduce a novel experience replay sampling framework for actor-critic methods, which also regards issues with stability and recent findings behind the poor empirical performance of PER. The introduced algorithm suggests a new branch of improvements to PER and schedules effective and efficient training for both actor and critic networks. An extensive set of experiments verifies our theoretical claims and demonstrates that the introduced method significantly outperforms the competing approaches and obtains state-of-the-art results over the standard off-policy actor-critic algorithms.
△ Less
Submitted 1 September, 2022;
originally announced September 2022.
-
Optimal Tracking in Prediction with Expert Advice
Authors:
Hakan Gokcesu,
Suleyman S. Kozat
Abstract:
We study the prediction with expert advice setting, where the aim is to produce a decision by combining the decisions generated by a set of experts, e.g., independently running algorithms. We achieve the min-max optimal dynamic regret under the prediction with expert advice setting, i.e., we can compete against time-varying (not necessarily fixed) combinations of expert decisions in an optimal man…
▽ More
We study the prediction with expert advice setting, where the aim is to produce a decision by combining the decisions generated by a set of experts, e.g., independently running algorithms. We achieve the min-max optimal dynamic regret under the prediction with expert advice setting, i.e., we can compete against time-varying (not necessarily fixed) combinations of expert decisions in an optimal manner. Our end-algorithm is truly online with no prior information, such as the time horizon or loss ranges, which are commonly used by different algorithms in the literature. Both our regret guarantees and the min-max lower bounds are derived with the general consideration that the expert losses can have time-varying properties and are possibly unbounded. Our algorithm can be adapted for restrictive scenarios regarding both loss feedback and decision making. Our guarantees are universal, i.e., our end-algorithm can provide regret guarantee against any competitor sequence in a min-max optimal manner with logarithmic complexity. Note that, to our knowledge, for the prediction with expert advice problem, our algorithms are the first to produce such universally optimal, adaptive and truly online guarantees with no prior knowledge.
△ Less
Submitted 7 August, 2022;
originally announced August 2022.
-
Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach
Authors:
Baturay Saglam,
Dogan C. Cicek,
Furkan B. Mutlu,
Suleyman S. Kozat
Abstract:
Compared to on-policy counterparts, off-policy model-free deep reinforcement learning can improve data efficiency by repeatedly using the previously gathered data. However, off-policy learning becomes challenging when the discrepancy between the underlying distributions of the agent's policy and collected data increases. Although the well-studied importance sampling and off-policy policy gradient…
▽ More
Compared to on-policy counterparts, off-policy model-free deep reinforcement learning can improve data efficiency by repeatedly using the previously gathered data. However, off-policy learning becomes challenging when the discrepancy between the underlying distributions of the agent's policy and collected data increases. Although the well-studied importance sampling and off-policy policy gradient techniques were proposed to compensate for this discrepancy, they usually require a collection of long trajectories and induce additional problems such as vanishing/exploding gradients or discarding many useful experiences, which eventually increases the computational complexity. Moreover, their generalization to either continuous action domains or policies approximated by deterministic deep neural networks is strictly limited. To overcome these limitations, we introduce a novel policy similarity measure to mitigate the effects of such discrepancy in continuous control. Our method offers an adequate single-step off-policy correction that is applicable to deterministic policy networks. Theoretical and empirical studies demonstrate that it can achieve a "safe" off-policy learning and substantially improve the state-of-the-art by attaining higher returns in fewer steps than the competing methods through an effective schedule of the learning rate in Q-learning and policy optimization.
△ Less
Submitted 25 September, 2023; v1 submitted 1 August, 2022;
originally announced August 2022.
-
Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms
Authors:
Baturay Saglam,
Dogan C. Cicek,
Furkan B. Mutlu,
Suleyman S. Kozat
Abstract:
Learning in high dimensional continuous tasks is challenging, mainly when the experience replay memory is very limited. We introduce a simple yet effective experience sharing mechanism for deterministic policies in continuous action domains for the future off-policy deep reinforcement learning applications in which the allocated memory for the experience replay buffer is limited. To overcome the e…
▽ More
Learning in high dimensional continuous tasks is challenging, mainly when the experience replay memory is very limited. We introduce a simple yet effective experience sharing mechanism for deterministic policies in continuous action domains for the future off-policy deep reinforcement learning applications in which the allocated memory for the experience replay buffer is limited. To overcome the extrapolation error induced by learning from other agents' experiences, we facilitate our algorithm with a novel off-policy correction technique without any action probability estimates. We test the effectiveness of our method in challenging OpenAI Gym continuous control tasks and conclude that it can achieve a safe experience sharing across multiple agents and exhibits a robust performance when the replay memory is strictly limited.
△ Less
Submitted 27 July, 2022;
originally announced July 2022.
-
A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization
Authors:
Mustafa E. Aydın,
Suleyman S. Kozat
Abstract:
We investigate nonlinear prediction in an online setting and introduce a hybrid model that effectively mitigates, via an end-to-end architecture, the need for hand-designed features and manual model selection issues of conventional nonlinear prediction/regression methods. In particular, we use recursive structures to extract features from sequential signals, while preserving the state information,…
▽ More
We investigate nonlinear prediction in an online setting and introduce a hybrid model that effectively mitigates, via an end-to-end architecture, the need for hand-designed features and manual model selection issues of conventional nonlinear prediction/regression methods. In particular, we use recursive structures to extract features from sequential signals, while preserving the state information, i.e., the history, and boosted decision trees to produce the final output. The connection is in an end-to-end fashion and we jointly optimize the whole architecture using stochastic gradient descent, for which we also provide the backward pass update equations. In particular, we employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machinery (soft GBDT) for effective supervised regression. Our framework is generic so that one can use other deep learning architectures for feature extraction (such as RNNs and GRUs) and machine learning algorithms for decision making as long as they are differentiable. We demonstrate the learning behavior of our algorithm on synthetic data and the significant performance improvements over the conventional methods over various real life datasets. Furthermore, we openly share the source code of the proposed method to facilitate further research.
△ Less
Submitted 4 August, 2022; v1 submitted 25 March, 2022;
originally announced March 2022.
-
Crime Prediction with Graph Neural Networks and Multivariate Normal Distributions
Authors:
Selim Furkan Tekin,
Suleyman Serdar Kozat
Abstract:
Existing approaches to the crime prediction problem are unsuccessful in expressing the details since they assign the probability values to large regions. This paper introduces a new architecture with the graph convolutional networks (GCN) and multivariate Gaussian distributions to perform high-resolution forecasting that applies to any spatiotemporal data. We tackle the sparsity problem in high re…
▽ More
Existing approaches to the crime prediction problem are unsuccessful in expressing the details since they assign the probability values to large regions. This paper introduces a new architecture with the graph convolutional networks (GCN) and multivariate Gaussian distributions to perform high-resolution forecasting that applies to any spatiotemporal data. We tackle the sparsity problem in high resolution by leveraging the flexible structure of GCNs and providing a subdivision algorithm. We build our model with Graph Convolutional Gated Recurrent Units (Graph-ConvGRU) to learn spatial, temporal, and categorical relations. In each node of the graph, we learn a multivariate probability distribution from the extracted features of GCNs. We perform experiments on real-life and synthetic datasets, and our model obtains the best validation and the best test score among the baseline models with significant improvements. We show that our model is not only generative but also precise.
△ Less
Submitted 16 December, 2021; v1 submitted 29 November, 2021;
originally announced November 2021.
-
AWD3: Dynamic Reduction of the Estimation Bias
Authors:
Dogan C. Cicek,
Enes Duran,
Baturay Saglam,
Kagan Kaya,
Furkan B. Mutlu,
Suleyman S. Kozat
Abstract:
Value-based deep Reinforcement Learning (RL) algorithms suffer from the estimation bias primarily caused by function approximation and temporal difference (TD) learning. This problem induces faulty state-action value estimates and therefore harms the performance and robustness of the learning algorithms. Although several techniques were proposed to tackle, learning algorithms still suffer from thi…
▽ More
Value-based deep Reinforcement Learning (RL) algorithms suffer from the estimation bias primarily caused by function approximation and temporal difference (TD) learning. This problem induces faulty state-action value estimates and therefore harms the performance and robustness of the learning algorithms. Although several techniques were proposed to tackle, learning algorithms still suffer from this bias. Here, we introduce a technique that eliminates the estimation bias in off-policy continuous control algorithms using the experience replay mechanism. We adaptively learn the weighting hyper-parameter beta in the Weighted Twin Delayed Deep Deterministic Policy Gradient algorithm. Our method is named Adaptive-WD3 (AWD3). We show through continuous control environments of OpenAI gym that our algorithm matches or outperforms the state-of-the-art off-policy policy gradient learning algorithms.
△ Less
Submitted 12 November, 2021;
originally announced November 2021.
-
Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay
Authors:
Dogan C. Cicek,
Enes Duran,
Baturay Saglam,
Furkan B. Mutlu,
Suleyman S. Kozat
Abstract:
The experience replay mechanism allows agents to use the experiences multiple times. In prior works, the sampling probability of the transitions was adjusted according to their importance. Reassigning sampling probabilities for every transition in the replay buffer after each iteration is highly inefficient. Therefore, experience replay prioritization algorithms recalculate the significance of a t…
▽ More
The experience replay mechanism allows agents to use the experiences multiple times. In prior works, the sampling probability of the transitions was adjusted according to their importance. Reassigning sampling probabilities for every transition in the replay buffer after each iteration is highly inefficient. Therefore, experience replay prioritization algorithms recalculate the significance of a transition when the corresponding transition is sampled to gain computational efficiency. However, the importance level of the transitions changes dynamically as the policy and the value function of the agent are updated. In addition, experience replay stores the transitions are generated by the previous policies of the agent that may significantly deviate from the most recent policy of the agent. Higher deviation from the most recent policy of the agent leads to more off-policy updates, which is detrimental for the agent. In this paper, we develop a novel algorithm, Batch Prioritizing Experience Replay via KL Divergence (KLPER), which prioritizes batch of transitions rather than directly prioritizing each transition. Moreover, to reduce the off-policyness of the updates, our algorithm selects one batch among a certain number of batches and forces the agent to learn through the batch that is most likely generated by the most recent policy of the agent. We combine our algorithm with Deep Deterministic Policy Gradient and Twin Delayed Deep Deterministic Policy Gradient and evaluate it on various continuous control tasks. KLPER provides promising improvements for deep deterministic continuous control algorithms in terms of sample efficiency, final performance, and stability of the policy during the training.
△ Less
Submitted 12 November, 2021; v1 submitted 2 November, 2021;
originally announced November 2021.
-
Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients
Authors:
Baturay Saglam,
Furkan Burak Mutlu,
Dogan Can Cicek,
Suleyman Serdar Kozat
Abstract:
Approximation of the value functions in value-based deep reinforcement learning induces overestimation bias, resulting in suboptimal policies. We show that when the reinforcement signals received by the agents have a high variance, deep actor-critic approaches that overcome the overestimation bias lead to a substantial underestimation bias. We first address the detrimental issues in the existing a…
▽ More
Approximation of the value functions in value-based deep reinforcement learning induces overestimation bias, resulting in suboptimal policies. We show that when the reinforcement signals received by the agents have a high variance, deep actor-critic approaches that overcome the overestimation bias lead to a substantial underestimation bias. We first address the detrimental issues in the existing approaches that aim to overcome such underestimation error. Then, through extensive statistical analysis, we introduce a novel, parameter-free Deep Q-learning variant to reduce this underestimation bias in deterministic policy gradients. By sampling the weights of a linear combination of two approximate critics from a highly shrunk estimation bias interval, our Q-value update rule is not affected by the variance of the rewards received by the agents throughout learning. We test the performance of the introduced improvement on a set of MuJoCo and Box2D continuous control tasks and demonstrate that it considerably outperforms the existing approaches and improves the state-of-the-art by a significant margin.
△ Less
Submitted 19 May, 2022; v1 submitted 24 September, 2021;
originally announced September 2021.
-
Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods
Authors:
Baturay Saglam,
Enes Duran,
Dogan C. Cicek,
Furkan B. Mutlu,
Suleyman S. Kozat
Abstract:
In value-based deep reinforcement learning methods, approximation of value functions induces overestimation bias and leads to suboptimal policies. We show that in deep actor-critic methods that aim to overcome the overestimation bias, if the reinforcement signals received by the agent have a high variance, a significant underestimation bias arises. To minimize the underestimation, we introduce a p…
▽ More
In value-based deep reinforcement learning methods, approximation of value functions induces overestimation bias and leads to suboptimal policies. We show that in deep actor-critic methods that aim to overcome the overestimation bias, if the reinforcement signals received by the agent have a high variance, a significant underestimation bias arises. To minimize the underestimation, we introduce a parameter-free, novel deep Q-learning variant. Our Q-value update rule combines the notions behind Clipped Double Q-learning and Maxmin Q-learning by computing the critic objective through the nested combination of maximum and minimum operators to bound the approximate value estimates. We evaluate our modification on the suite of several OpenAI Gym continuous control tasks, improving the state-of-the-art in every environment tested.
△ Less
Submitted 23 September, 2021; v1 submitted 22 September, 2021;
originally announced September 2021.
-
Numerical Weather Forecasting using Convolutional-LSTM with Attention and Context Matcher Mechanisms
Authors:
Selim Furkan Tekin,
Arda Fazla,
Suleyman Serdar Kozat
Abstract:
Numerical weather forecasting using high-resolution physical models often requires extensive computational resources on supercomputers, which diminishes their wide usage in most real-life applications. As a remedy, applying deep learning methods has revealed innovative solutions within this field. To this end, we introduce a novel deep learning architecture for forecasting high-resolution spatio-t…
▽ More
Numerical weather forecasting using high-resolution physical models often requires extensive computational resources on supercomputers, which diminishes their wide usage in most real-life applications. As a remedy, applying deep learning methods has revealed innovative solutions within this field. To this end, we introduce a novel deep learning architecture for forecasting high-resolution spatio-temporal weather data. Our approach extends the conventional encoder-decoder structure by integrating Convolutional Long-short Term Memory and Convolutional Neural Networks. In addition, we incorporate attention and context matcher mechanisms into the model architecture. Our Weather Model achieves significant performance improvements compared to baseline deep learning models, including ConvLSTM, TrajGRU, and U-Net. Our experimental evaluation involves high-scale, real-world benchmark numerical weather datasets, namely the ERA5 hourly dataset on pressure levels and WeatherBench. Our results demonstrate substantial improvements in identifying spatial and temporal correlations with attention matrices focusing on distinct parts of the input series to model atmospheric circulations. We also compare our model with high-resolution physical models using the benchmark metrics and show that our Weather Model is accurate and easy to interpret.
△ Less
Submitted 4 October, 2023; v1 submitted 1 February, 2021;
originally announced February 2021.
-
PySAD: A Streaming Anomaly Detection Framework in Python
Authors:
Selim F. Yilmaz,
Suleyman S. Kozat
Abstract:
PySAD is an open-source python framework for anomaly detection on streaming data. PySAD serves various state-of-the-art methods for streaming anomaly detection. The framework provides a complete set of tools to design anomaly detection experiments ranging from projectors to probability calibrators. PySAD builds upon popular open-source frameworks such as PyOD and scikit-learn. We enforce software…
▽ More
PySAD is an open-source python framework for anomaly detection on streaming data. PySAD serves various state-of-the-art methods for streaming anomaly detection. The framework provides a complete set of tools to design anomaly detection experiments ranging from projectors to probability calibrators. PySAD builds upon popular open-source frameworks such as PyOD and scikit-learn. We enforce software quality by enforcing compliance with PEP8 guidelines, functional testing and using continuous integration. The source code is publicly available on https://github.com/selimfirat/pysad.
△ Less
Submitted 5 September, 2020;
originally announced September 2020.
-
Multi-Label Sentiment Analysis on 100 Languages with Dynamic Weighting for Label Imbalance
Authors:
Selim F. Yilmaz,
E. Batuhan Kaynak,
Aykut Koç,
Hamdi Dibeklioğlu,
Suleyman S. Kozat
Abstract:
We investigate cross-lingual sentiment analysis, which has attracted significant attention due to its applications in various areas including market research, politics and social sciences. In particular, we introduce a sentiment analysis framework in multi-label setting as it obeys Plutchik wheel of emotions. We introduce a novel dynamic weighting method that balances the contribution from each cl…
▽ More
We investigate cross-lingual sentiment analysis, which has attracted significant attention due to its applications in various areas including market research, politics and social sciences. In particular, we introduce a sentiment analysis framework in multi-label setting as it obeys Plutchik wheel of emotions. We introduce a novel dynamic weighting method that balances the contribution from each class during training, unlike previous static weighting methods that assign non-changing weights based on their class frequency. Moreover, we adapt the focal loss that favors harder instances from single-label object recognition literature to our multi-label setting. Furthermore, we derive a method to choose optimal class-specific thresholds that maximize the macro-f1 score in linear time complexity. Through an extensive set of experiments, we show that our method obtains the state-of-the-art performance in 7 of 9 metrics in 3 different languages using a single model compared to the common baselines and the best-performing methods in the SemEval competition. We publicly share our code for our model, which can perform sentiment analysis in 100 languages, to facilitate further research.
△ Less
Submitted 26 August, 2020;
originally announced August 2020.
-
Spatio-temporal Sequence Prediction with Point Processes and Self-organizing Decision Trees
Authors:
Oguzhan Karaahmetoglu,
Suleyman S. Kozat
Abstract:
We study the spatio-temporal prediction problem and introduce a novel point-process-based prediction algorithm. Spatio-temporal prediction is extensively studied in Machine Learning literature due to its critical real-life applications such as crime, earthquake, and social event prediction. Despite these thorough studies, specific problems inherent to the application domain are not yet fully explo…
▽ More
We study the spatio-temporal prediction problem and introduce a novel point-process-based prediction algorithm. Spatio-temporal prediction is extensively studied in Machine Learning literature due to its critical real-life applications such as crime, earthquake, and social event prediction. Despite these thorough studies, specific problems inherent to the application domain are not yet fully explored. Here, we address the non-stationary spatio-temporal prediction problem on both densely and sparsely distributed sequences. We introduce a probabilistic approach that partitions the spatial domain into subregions and models the event arrivals in each region with interacting point-processes. Our algorithm can jointly learn the spatial partitioning and the interaction between these regions through a gradient-based optimization procedure. Finally, we demonstrate the performance of our algorithm on both simulated data and two real-life datasets. We compare our approach with baseline and state-of-the-art deep learning-based approaches, where we achieve significant performance improvements. Moreover, we also show the effect of using different parameters on the overall performance through empirical results and explain the procedure for choosing the parameters.
△ Less
Submitted 15 March, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Markovian RNN: An Adaptive Time Series Prediction Network with HMM-based Switching for Nonstationary Environments
Authors:
Fatih Ilhan,
Oguzhan Karaahmetoglu,
Ismail Balaban,
Suleyman Serdar Kozat
Abstract:
We investigate nonlinear regression for nonstationary sequential data. In most real-life applications such as business domains including finance, retail, energy and economy, timeseries data exhibits nonstationarity due to the temporally varying dynamics of the underlying system. We introduce a novel recurrent neural network (RNN) architecture, which adaptively switches between internal regimes in…
▽ More
We investigate nonlinear regression for nonstationary sequential data. In most real-life applications such as business domains including finance, retail, energy and economy, timeseries data exhibits nonstationarity due to the temporally varying dynamics of the underlying system. We introduce a novel recurrent neural network (RNN) architecture, which adaptively switches between internal regimes in a Markovian way to model the nonstationary nature of the given data. Our model, Markovian RNN employs a hidden Markov model (HMM) for regime transitions, where each regime controls hidden state transitions of the recurrent cell independently. We jointly optimize the whole network in an end-to-end fashion. We demonstrate the significant performance gains compared to vanilla RNN and conventional methods such as Markov Switching ARIMA through an extensive set of experiments with synthetic and real-life datasets. We also interpret the inferred parameters and regime belief values to analyze the underlying dynamics of the given sequences.
△ Less
Submitted 17 June, 2020;
originally announced June 2020.
-
Unsupervised Online Anomaly Detection On Irregularly Sampled Or Missing Valued Time-Series Data Using LSTM Networks
Authors:
Oguzhan Karaahmetoglu,
Fatih Ilhan,
Ismail Balaban,
Suleyman Serdar Kozat
Abstract:
We study anomaly detection and introduce an algorithm that processes variable length, irregularly sampled sequences or sequences with missing values. Our algorithm is fully unsupervised, however, can be readily extended to supervised or semisupervised cases when the anomaly labels are present as remarked throughout the paper. Our approach uses the Long Short Term Memory (LSTM) networks in order to…
▽ More
We study anomaly detection and introduce an algorithm that processes variable length, irregularly sampled sequences or sequences with missing values. Our algorithm is fully unsupervised, however, can be readily extended to supervised or semisupervised cases when the anomaly labels are present as remarked throughout the paper. Our approach uses the Long Short Term Memory (LSTM) networks in order to extract temporal features and find the most relevant feature vectors for anomaly detection. We incorporate the sampling time information to our model by modulating the standard LSTM model with time modulation gates. After obtaining the most relevant features from the LSTM, we label the sequences using a Support Vector Data Descriptor (SVDD) model. We introduce a loss function and then jointly optimize the feature extraction and sequence processing mechanisms in an end-to-end manner. Through this joint optimization, the LSTM extracts the most relevant features for anomaly detection later to be used in the SVDD, hence completely removes the need for feature selection by expert knowledge. Furthermore, we provide a training algorithm for the online setup, where we optimize our model parameters with individual sequences as the new data arrives. Finally, on real-life datasets, we show that our model significantly outperforms the standard approaches thanks to its combination of LSTM with SVDD and joint optimization.
△ Less
Submitted 25 May, 2020;
originally announced May 2020.
-
A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data
Authors:
S. Onur Sahin,
Suleyman S. Kozat
Abstract:
We investigate regression for variable length sequential data containing missing samples and introduce a novel tree architecture based on the Long Short-Term Memory (LSTM) networks. In our architecture, we employ a variable number of LSTM networks, which use only the existing inputs in the sequence, in a tree-like architecture without any statistical assumptions or imputations on the missing data,…
▽ More
We investigate regression for variable length sequential data containing missing samples and introduce a novel tree architecture based on the Long Short-Term Memory (LSTM) networks. In our architecture, we employ a variable number of LSTM networks, which use only the existing inputs in the sequence, in a tree-like architecture without any statistical assumptions or imputations on the missing data, unlike all the previous approaches. In particular, we incorporate the missingness information by selecting a subset of these LSTM networks based on "presence-pattern" of a certain number of previous inputs. From the mixture of experts perspective, we train different LSTM networks as our experts for various missingness patterns and then combine their outputs to generate the final prediction. We also provide the computational complexity analysis of the proposed architecture, which is in the same order of the complexity of the conventional LSTM architectures for the sequence length. Our method can be readily extended to similar structures such as GRUs, RNNs as remarked in the paper. In the experiments, we achieve significant performance improvements with respect to the state-of-the-art methods for the well-known financial and real life datasets.
△ Less
Submitted 22 May, 2020;
originally announced May 2020.
-
Achieving Online Regression Performance of LSTMs with Simple RNNs
Authors:
N. Mert Vural,
Fatih Ilhan,
Selim F. Yilmaz,
Salih Ergüt,
Suleyman S. Kozat
Abstract:
Recurrent Neural Networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. As an RNN model, Long-Short-Term-Memory Networks (LSTMs) are commonly preferred in practice, as these networks are capable of learning long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, traini…
▽ More
Recurrent Neural Networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. As an RNN model, Long-Short-Term-Memory Networks (LSTMs) are commonly preferred in practice, as these networks are capable of learning long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, training LSTMs requires considerably longer training time compared to simple RNNs (SRNNs). In this paper, we achieve the online regression performance of LSTMs with SRNNs efficiently. To this end, we introduce a first-order training algorithm with a linear time complexity in the number of parameters. We show that when SRNNs are trained with our algorithm, they provide very similar regression performance with the LSTMs in two to three times shorter training time. We provide strong theoretical analysis to support our experimental results by providing regret bounds on the convergence rate of our algorithm. Through an extensive set of experiments, we verify our theoretical work and demonstrate significant performance improvements of our algorithm with respect to LSTMs and the other state-of-the-art learning models.
△ Less
Submitted 31 May, 2021; v1 submitted 16 May, 2020;
originally announced May 2020.
-
Unsupervised Anomaly Detection via Deep Metric Learning with End-to-End Optimization
Authors:
Selim F. Yilmaz,
Suleyman S. Kozat
Abstract:
We investigate unsupervised anomaly detection for high-dimensional data and introduce a deep metric learning (DML) based framework. In particular, we learn a distance metric through a deep neural network. Through this metric, we project the data into the metric space that better separates the anomalies from the normal data and reduces the effect of the curse of dimensionality for high-dimensional…
▽ More
We investigate unsupervised anomaly detection for high-dimensional data and introduce a deep metric learning (DML) based framework. In particular, we learn a distance metric through a deep neural network. Through this metric, we project the data into the metric space that better separates the anomalies from the normal data and reduces the effect of the curse of dimensionality for high-dimensional data. We present a novel data distillation method through self-supervision to remedy the conventional practice of assuming all data as normal. We also employ the hard mining technique from the DML literature. We show these components improve the performance of our model and significantly reduce the running time. Through an extensive set of experiments on the 14 real-world datasets, our method demonstrates significant performance gains compared to the state-of-the-art unsupervised anomaly detection methods, e.g., an absolute improvement between 4.44% and 11.74% on the average over the 14 datasets. Furthermore, we share the source code of our method on Github to facilitate further research.
△ Less
Submitted 12 May, 2020;
originally announced May 2020.
-
Modeling of Spatio-Temporal Hawkes Processes with Randomized Kernels
Authors:
Fatih Ilhan,
Suleyman Serdar Kozat
Abstract:
We investigate spatio-temporal event analysis using point processes. Inferring the dynamics of event sequences spatiotemporally has many practical applications including crime prediction, social media analysis, and traffic forecasting. In particular, we focus on spatio-temporal Hawkes processes that are commonly used due to their capability to capture excitations between event occurrences. We intr…
▽ More
We investigate spatio-temporal event analysis using point processes. Inferring the dynamics of event sequences spatiotemporally has many practical applications including crime prediction, social media analysis, and traffic forecasting. In particular, we focus on spatio-temporal Hawkes processes that are commonly used due to their capability to capture excitations between event occurrences. We introduce a novel inference framework based on randomized transformations and gradient descent to learn the process. We replace the spatial kernel calculations by randomized Fourier feature-based transformations. The introduced randomization by this representation provides flexibility while modeling the spatial excitation between events. Moreover, the system described by the process is expressed within closed-form in terms of scalable matrix operations. During the optimization, we use maximum likelihood estimation approach and gradient descent while properly handling positivity and orthonormality constraints. The experiment results show the improvements achieved by the introduced method in terms of fitting capability in synthetic and real datasets with respect to the conventional inference methods in the spatio-temporal Hawkes process literature. We also analyze the triggering interactions between event types and how their dynamics change in space and time through the interpretation of learned parameters.
△ Less
Submitted 15 February, 2021; v1 submitted 7 March, 2020;
originally announced March 2020.
-
Prediction with Spatio-temporal Point Processes with Self Organizing Decision Trees
Authors:
Oguzhan Karaahmetoglu,
Suleyman Serdar Kozat
Abstract:
We study the spatio-temporal prediction problem, which has attracted the attention of many researchers due to its critical real-life applications. In particular, we introduce a novel approach to this problem. Our approach is based on the Hawkes process, which is a non-stationary and self-exciting point process. We extend the formulations of a standard point process model that can represent time-se…
▽ More
We study the spatio-temporal prediction problem, which has attracted the attention of many researchers due to its critical real-life applications. In particular, we introduce a novel approach to this problem. Our approach is based on the Hawkes process, which is a non-stationary and self-exciting point process. We extend the formulations of a standard point process model that can represent time-series data to represent a spatio-temporal data. We model the data as nonstationary in time and space. Furthermore, we partition the spatial region we are working on into subregions via an adaptive decision tree and model the source statistics in each subregion with individual but mutually interacting point processes. We also provide a gradient based joint optimization algorithm for the point process and decision tree parameters. Thus, we introduce a model that can jointly infer the source statistics and an adaptive partitioning of the spatial region. Finally, we provide experimental results on real-life data, which provides significant improvement due to space adaptation and joint optimization compared to standard well-known methods in the literature.
△ Less
Submitted 5 July, 2020; v1 submitted 7 March, 2020;
originally announced March 2020.
-
RNN-based Online Learning: An Efficient First-Order Optimization Algorithm with a Convergence Guarantee
Authors:
N. Mert Vural,
Selim F. Yilmaz,
Fatih Ilhan,
Suleyman S. Kozat
Abstract:
We investigate online nonlinear regression with continually running recurrent neural network networks (RNNs), i.e., RNN-based online learning. For RNN-based online learning, we introduce an efficient first-order training algorithm that theoretically guarantees to converge to the optimum network parameters. Our algorithm is truly online such that it does not make any assumption on the learning envi…
▽ More
We investigate online nonlinear regression with continually running recurrent neural network networks (RNNs), i.e., RNN-based online learning. For RNN-based online learning, we introduce an efficient first-order training algorithm that theoretically guarantees to converge to the optimum network parameters. Our algorithm is truly online such that it does not make any assumption on the learning environment to guarantee convergence. Through numerical simulations, we verify our theoretical results and illustrate significant performance improvements achieved by our algorithm with respect to the state-of-the-art RNN training methods.
△ Less
Submitted 31 May, 2021; v1 submitted 7 March, 2020;
originally announced March 2020.
-
Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning
Authors:
Nuri Mert Vural,
Fatih Ilhan,
Suleyman S. Kozat
Abstract:
We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long-short term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations -- introduced due t…
▽ More
We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long-short term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations -- introduced due to decoupling -- stay bounded, DEKF learns LSTM parameters with similar convergence and stability properties of the global extended Kalman filter learning algorithm. We verify our results with several numerical simulations and compare DEKF with other LSTM training methods. In our simulations, we also observe that the well-known hyper-parameter selection approaches used for DEKF in the literature satisfy our conditions.
△ Less
Submitted 31 May, 2021; v1 submitted 25 November, 2019;
originally announced November 2019.
-
Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays
Authors:
N. Mert Vural,
Hakan Gokcesu,
Kaan Gokcesu,
Suleyman S. Kozat
Abstract:
We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal regret bounds. To construct our algorithm, we introduce a new expert advice algorithm for the multiple-play setting. By using our expert advice algorithm, we a…
▽ More
We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal regret bounds. To construct our algorithm, we introduce a new expert advice algorithm for the multiple-play setting. By using our expert advice algorithm, we additionally improve the best-known high-probability bound for the multi-play setting by $O(\sqrt{m})$. Our results are guaranteed to hold in an individual sequence manner since we have no statistical assumption on the bandit arm gains. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art algorithms.
△ Less
Submitted 25 November, 2019;
originally announced November 2019.
-
An Efficient and Effective Second-Order Training Algorithm for LSTM-based Adaptive Learning
Authors:
N. Mert Vural,
Salih Ergüt,
Suleyman S. Kozat
Abstract:
We study adaptive (or online) nonlinear regression with Long-Short-Term-Memory (LSTM) based networks, i.e., LSTM-based adaptive learning. In this context, we introduce an efficient Extended Kalman filter (EKF) based second-order training algorithm. Our algorithm is truly online, i.e., it does not assume any underlying data generating process and future information, except that the target sequence…
▽ More
We study adaptive (or online) nonlinear regression with Long-Short-Term-Memory (LSTM) based networks, i.e., LSTM-based adaptive learning. In this context, we introduce an efficient Extended Kalman filter (EKF) based second-order training algorithm. Our algorithm is truly online, i.e., it does not assume any underlying data generating process and future information, except that the target sequence is bounded. Through an extensive set of experiments, we demonstrate significant performance gains achieved by our algorithm with respect to the state-of-the-art methods. Here, we mainly show that our algorithm consistently provides 10 to 45\% improvement in the accuracy compared to the widely-used adaptive methods Adam, RMSprop, and DEKF, and comparable performance to EKF with a 10 to 15 times reduction in the run-time.
△ Less
Submitted 31 May, 2021; v1 submitted 22 October, 2019;
originally announced October 2019.
-
Universal Online Convex Optimization with Minimax Optimal Second-Order Dynamic Regret
Authors:
Hakan Gokcesu,
Suleyman S. Kozat
Abstract:
We introduce an online convex optimization algorithm which utilizes projected subgradient descent with optimal adaptive learning rates. Our method provides second-order minimax-optimal dynamic regret guarantee (i.e. dependent on the sum of squared subgradient norms) for a sequence of general convex functions, which may not have strong convexity, smoothness, exp-concavity or even Lipschitz-continui…
▽ More
We introduce an online convex optimization algorithm which utilizes projected subgradient descent with optimal adaptive learning rates. Our method provides second-order minimax-optimal dynamic regret guarantee (i.e. dependent on the sum of squared subgradient norms) for a sequence of general convex functions, which may not have strong convexity, smoothness, exp-concavity or even Lipschitz-continuity. The regret guarantee is against any comparator decision sequence with bounded path variation (i.e. sum of the distances between successive decisions). We generate the lower bound of the worst-case second-order dynamic regret by incorporating actual subgradient norms. We show that this lower bound matches with our regret guarantee within a constant factor, which makes our algorithm minimax optimal. We also derive the extension for learning in each decision coordinate individually. We demonstrate how to best preserve our regret guarantee in a truly online manner, when the bound on path variation of the comparator sequence grows in time or the feedback regarding such bound arrives partially as time goes on. We further build on our algorithm to eliminate the need of any knowledge on the comparator path variation, and provide minimax optimal second-order regret guarantees with no a priori information. Our approach can compete against all comparator sequences simultaneously (universally) in a minimax optimal manner, i.e. each regret guarantee depends on the respective comparator path variation. We discuss modifications to our approach which address complexity reductions for time, computation and memory. We further improve our results by making the regret guarantees also dependent on comparator sets' diameters in addition to the respective path variations.
△ Less
Submitted 13 September, 2022; v1 submitted 30 June, 2019;
originally announced July 2019.
-
Minimax Optimal Online Stochastic Learning for Sequences of Convex Functions under Sub-Gradient Observation Failures
Authors:
Hakan Gokcesu,
Suleyman S. Kozat
Abstract:
We study online convex optimization under stochastic sub-gradient observation faults, where we introduce adaptive algorithms with minimax optimal regret guarantees. We specifically study scenarios where our sub-gradient observations can be noisy or even completely missing in a stochastic manner. To this end, we propose algorithms based on sub-gradient descent method, which achieve tight minimax op…
▽ More
We study online convex optimization under stochastic sub-gradient observation faults, where we introduce adaptive algorithms with minimax optimal regret guarantees. We specifically study scenarios where our sub-gradient observations can be noisy or even completely missing in a stochastic manner. To this end, we propose algorithms based on sub-gradient descent method, which achieve tight minimax optimal regret bounds. When necessary, these algorithms utilize properties of the underlying stochastic settings to optimize their learning rates (step sizes). These optimizations are the main factor in providing the minimax optimal performance guarantees, especially when observations are stochastically missing. However, in real world scenarios, these properties of the underlying stochastic settings may not be revealed to the optimizer. For such a scenario, we propose a blind algorithm that estimates these properties empirically in a generally applicable manner. Through extensive experiments, we show that this empirical approach is a natural combination of regular stochastic gradient descent and the minimax optimal algorithms (which work best for randomized and adversarial function sequences, respectively).
△ Less
Submitted 19 April, 2019;
originally announced April 2019.
-
Sequential Outlier Detection based on Incremental Decision Trees
Authors:
Mohammadreza Mohaghegh Neyshabouri,
Suleyman Serdar Kozat
Abstract:
We introduce an online outlier detection algorithm to detect outliers in a sequentially observed data stream. For this purpose, we use a two-stage filtering and hedging approach. In the first stage, we construct a multi-modal probability density function to model the normal samples. In the second stage, given a new observation, we label it as an anomaly if the value of aforementioned density funct…
▽ More
We introduce an online outlier detection algorithm to detect outliers in a sequentially observed data stream. For this purpose, we use a two-stage filtering and hedging approach. In the first stage, we construct a multi-modal probability density function to model the normal samples. In the second stage, given a new observation, we label it as an anomaly if the value of aforementioned density function is below a specified threshold at the newly observed point. In order to construct our multi-modal density function, we use an incremental decision tree to construct a set of subspaces of the observation space. We train a single component density function of the exponential family using the observations, which fall inside each subspace represented on the tree. These single component density functions are then adaptively combined to produce our multi-modal density function, which is shown to achieve the performance of the best convex combination of the density functions defined on the subspaces. As we observe more samples, our tree grows and produces more subspaces. As a result, our modeling power increases in time, while mitigating overfitting issues. In order to choose our threshold level to label the observations, we use an adaptive thresholding scheme. We show that our adaptive threshold level achieves the performance of the optimal pre-fixed threshold level, which knows the observation labels in hindsight. Our algorithm provides significant performance improvements over the state of the art in our wide set of experiments involving both synthetic as well as real data.
△ Less
Submitted 9 March, 2018;
originally announced March 2018.
-
Unsupervised and Semi-supervised Anomaly Detection with LSTM Neural Networks
Authors:
Tolga Ergen,
Ali Hassan Mirza,
Suleyman Serdar Kozat
Abstract:
We investigate anomaly detection in an unsupervised framework and introduce Long Short Term Memory (LSTM) neural network based algorithms. In particular, given variable length data sequences, we first pass these sequences through our LSTM based structure and obtain fixed length sequences. We then find a decision function for our anomaly detectors based on the One Class Support Vector Machines (OC-…
▽ More
We investigate anomaly detection in an unsupervised framework and introduce Long Short Term Memory (LSTM) neural network based algorithms. In particular, given variable length data sequences, we first pass these sequences through our LSTM based structure and obtain fixed length sequences. We then find a decision function for our anomaly detectors based on the One Class Support Vector Machines (OC-SVM) and Support Vector Data Description (SVDD) algorithms. As the first time in the literature, we jointly train and optimize the parameters of the LSTM architecture and the OC-SVM (or SVDD) algorithm using highly effective gradient and quadratic programming based training methods. To apply the gradient based training method, we modify the original objective criteria of the OC-SVM and SVDD algorithms, where we prove the convergence of the modified objective criteria to the original criteria. We also provide extensions of our unsupervised formulation to the semi-supervised and fully supervised frameworks. Thus, we obtain anomaly detection algorithms that can process variable length data sequences while providing high performance, especially for time series data. Our approach is generic so that we also apply this approach to the Gated Recurrent Unit (GRU) architecture by directly replacing our LSTM based structure with the GRU based structure. In our experiments, we illustrate significant performance gains achieved by our algorithms with respect to the conventional methods.
△ Less
Submitted 25 October, 2017;
originally announced October 2017.
-
Efficient Implementation Of Newton-Raphson Methods For Sequential Data Prediction
Authors:
Burak C. Civek,
Suleyman S. Kozat
Abstract:
We investigate the problem of sequential linear data prediction for real life big data applications. The second order algorithms, i.e., Newton-Raphson Methods, asymptotically achieve the performance of the "best" possible linear data predictor much faster compared to the first order algorithms, e.g., Online Gradient Descent. However, implementation of these methods is not usually feasible in big d…
▽ More
We investigate the problem of sequential linear data prediction for real life big data applications. The second order algorithms, i.e., Newton-Raphson Methods, asymptotically achieve the performance of the "best" possible linear data predictor much faster compared to the first order algorithms, e.g., Online Gradient Descent. However, implementation of these methods is not usually feasible in big data applications because of the extremely high computational needs. Regular implementation of the Newton-Raphson Methods requires a computational complexity in the order of $O(M^2)$ for an $M$ dimensional feature vector, while the first order algorithms need only $O(M)$. To this end, in order to eliminate this gap, we introduce a highly efficient implementation reducing the computational complexity of the Newton-Raphson Methods from quadratic to linear scale. The presented algorithm provides the well-known merits of the second order methods while offering the computational complexity of $O(M)$. We utilize the shifted nature of the consecutive feature vectors and do not rely on any statistical assumptions. Therefore, both regular and fast implementations achieve the same performance in the sense of mean square error. We demonstrate the computational efficiency of our algorithm on real life sequential big datasets. We also illustrate that the presented algorithm is numerically stable.
△ Less
Submitted 19 January, 2017;
originally announced January 2017.
-
Highly Efficient Hierarchical Online Nonlinear Regression Using Second Order Methods
Authors:
Burak C. Civek,
Ibrahim Delibalta,
Suleyman S. Kozat
Abstract:
We introduce highly efficient online nonlinear regression algorithms that are suitable for real life applications. We process the data in a truly online manner such that no storage is needed, i.e., the data is discarded after being used. For nonlinear modeling we use a hierarchical piecewise linear approach based on the notion of decision trees where the space of the regressor vectors is adaptivel…
▽ More
We introduce highly efficient online nonlinear regression algorithms that are suitable for real life applications. We process the data in a truly online manner such that no storage is needed, i.e., the data is discarded after being used. For nonlinear modeling we use a hierarchical piecewise linear approach based on the notion of decision trees where the space of the regressor vectors is adaptively partitioned based on the performance. As the first time in the literature, we learn both the piecewise linear partitioning of the regressor space as well as the linear models in each region using highly effective second order methods, i.e., Newton-Raphson Methods. Hence, we avoid the well known over fitting issues by using piecewise linear models, however, since both the region boundaries as well as the linear models in each region are trained using the second order methods, we achieve substantial performance compared to the state of the art. We demonstrate our gains over the well known benchmark data sets and provide performance results in an individual sequence manner guaranteed to hold without any statistical assumptions. Hence, the introduced algorithms address computational complexity issues widely encountered in real life applications while providing superior guaranteed performance in a strong deterministic sense.
△ Less
Submitted 18 January, 2017;
originally announced January 2017.
-
An Asymptotically Optimal Contextual Bandit Algorithm Using Hierarchical Structures
Authors:
Mohammadreza Mohaghegh Neyshabouri,
Kaan Gokcesu,
Huseyin Ozkan,
Suleyman S. Kozat
Abstract:
We propose online algorithms for sequential learning in the contextual multi-armed bandit setting. Our approach is to partition the context space and then optimally combine all of the possible map**s between the partition regions and the set of bandit arms in a data driven manner. We show that in our approach, the best map** is able to approximate the best arm selection policy to any desired d…
▽ More
We propose online algorithms for sequential learning in the contextual multi-armed bandit setting. Our approach is to partition the context space and then optimally combine all of the possible map**s between the partition regions and the set of bandit arms in a data driven manner. We show that in our approach, the best map** is able to approximate the best arm selection policy to any desired degree under mild Lipschitz conditions. Therefore, we design our algorithms based on the optimal adaptive combination and asymptotically achieve the performance of the best map** as well as the best arm selection policy. This optimality is also guaranteed to hold even in adversarial environments since we do not rely on any statistical assumptions regarding the contexts or the loss of the bandit arms. Moreover, we design efficient implementations for our algorithms in various hierarchical partitioning structures such as lexicographical or arbitrary position splitting and binary trees (and several other partitioning examples). For instance, in the case of binary tree partitioning, the computational complexity is only log-linear in the number of regions in the finest partition. In conclusion, we provide significant performance improvements by introducing upper bounds (w.r.t. the best arm selection policy) that are mathematically proven to vanish in the average loss per round sense at a faster rate compared to the state-of-the-art. Our experimental work extensively covers various scenarios ranging from bandit settings to multi-class classification with real and synthetic data. In these experiments, we show that our algorithms are highly superior over the state-of-the-art techniques while maintaining the introduced mathematical guarantees and a computationally decent scalability.
△ Less
Submitted 7 December, 2017; v1 submitted 5 December, 2016;
originally announced December 2016.
-
Team-Optimal Distributed MMSE Estimation in General and Tree Networks
Authors:
Muhammed O. Sayin,
Suleyman S. Kozat,
Tamer Başar
Abstract:
We construct team-optimal estimation algorithms over distributed networks for state estimation in the finite-horizon mean-square error (MSE) sense. Here, we have a distributed collection of agents with processing and cooperation capabilities. These agents observe noisy samples of a desired state through a linear model and seek to learn this state by interacting with each other. Although this probl…
▽ More
We construct team-optimal estimation algorithms over distributed networks for state estimation in the finite-horizon mean-square error (MSE) sense. Here, we have a distributed collection of agents with processing and cooperation capabilities. These agents observe noisy samples of a desired state through a linear model and seek to learn this state by interacting with each other. Although this problem has attracted significant attention and been studied extensively in fields including machine learning and signal processing, all the well-known strategies do not achieve team-optimal learning performance in the finite-horizon MSE sense. To this end, we formulate the finite-horizon distributed minimum MSE (MMSE) when there is no restriction on the size of the disclosed information, i.e., oracle performance, over an arbitrary network topology. Subsequently, we show that exchange of local estimates is sufficient to achieve the oracle performance only over certain network topologies. By inspecting these network structures, we propose recursive algorithms achieving the oracle performance through the disclosure of local estimates. For practical implementations we also provide approaches to reduce the complexity of the algorithms through the time-windowing of the observations. Finally, in the numerical examples, we demonstrate the superior performance of the introduced algorithms in the finite-horizon MSE sense due to optimal estimation.
△ Less
Submitted 8 January, 2017; v1 submitted 3 October, 2016;
originally announced October 2016.
-
Adaptive and Efficient Nonlinear Channel Equalization for Underwater Acoustic Communication
Authors:
Dariush Kari,
Nuri Denizcan Vanli,
Suleyman Serdar Kozat
Abstract:
We investigate underwater acoustic (UWA) channel equalization and introduce hierarchical and adaptive nonlinear channel equalization algorithms that are highly efficient and provide significantly improved bit error rate (BER) performance. Due to the high complexity of nonlinear equalizers and poor performance of linear ones, to equalize highly difficult underwater acoustic channels, we employ piec…
▽ More
We investigate underwater acoustic (UWA) channel equalization and introduce hierarchical and adaptive nonlinear channel equalization algorithms that are highly efficient and provide significantly improved bit error rate (BER) performance. Due to the high complexity of nonlinear equalizers and poor performance of linear ones, to equalize highly difficult underwater acoustic channels, we employ piecewise linear equalizers. However, in order to achieve the performance of the best piecewise linear model, we use a tree structure to hierarchically partition the space of the received signal. Furthermore, the equalization algorithm should be completely adaptive, since due to the highly non-stationary nature of the underwater medium, the optimal MSE equalizer as well as the best piecewise linear equalizer changes in time. To this end, we introduce an adaptive piecewise linear equalization algorithm that not only adapts the linear equalizer at each region but also learns the complete hierarchical structure with a computational complexity only polynomial in the number of nodes of the tree. Furthermore, our algorithm is constructed to directly minimize the final squared error without introducing any ad-hoc parameters. We demonstrate the performance of our algorithms through highly realistic experiments performed on accurately simulated underwater acoustic channels.
△ Less
Submitted 6 January, 2016;
originally announced January 2016.
-
A Novel Family of Boosted Online Regression Algorithms with Strong Theoretical Bounds
Authors:
Dariush Kari,
Farhan Khan,
Selami Ciftci,
Suleyman Serdar Kozat
Abstract:
We investigate boosted online regression and propose a novel family of regression algorithms with strong theoretical bounds. In addition, we implement several variants of the proposed generic algorithm. We specifically provide theoretical bounds for the performance of our proposed algorithms that hold in a strong mathematical sense. We achieve guaranteed performance improvement over the convention…
▽ More
We investigate boosted online regression and propose a novel family of regression algorithms with strong theoretical bounds. In addition, we implement several variants of the proposed generic algorithm. We specifically provide theoretical bounds for the performance of our proposed algorithms that hold in a strong mathematical sense. We achieve guaranteed performance improvement over the conventional online regression methods without any statistical assumptions on the desired data or feature vectors. We demonstrate an intrinsic relationship, in terms of boosting, between the adaptive mixture-of-experts and data reuse algorithms. Furthermore, we introduce a boosting algorithm based on random updates that is significantly faster than the conventional boosting methods and other variants of our proposed algorithms while achieving an enhanced performance gain. Hence, the random updates method is specifically applicable to the fast and high dimensional streaming data. Specifically, we investigate Newton Method-based and Stochastic Gradient Descent-based linear regression algorithms in a mixture-of-experts setting and provide several variants of these well-known adaptation methods. However, the proposed algorithms can be extended to other base learners, e.g., nonlinear, tree-based piecewise linear. Furthermore, we provide theoretical bounds for the computational complexity of our proposed algorithms. We demonstrate substantial performance gains in terms of mean square error over the base learners through an extensive set of benchmark real data sets and simulated examples.
△ Less
Submitted 6 December, 2016; v1 submitted 4 January, 2016;
originally announced January 2016.
-
A new robust adaptive algorithm for underwater acoustic channel equalization
Authors:
Dariush Kari,
Muhammed Omer Sayin,
Suleyman Serdar Kozat
Abstract:
We introduce a novel family of adaptive robust equalizers for highly challenging underwater acoustic (UWA) channel equalization. Since the underwater environment is highly non-stationary and subjected to impulsive noise, we use adaptive filtering techniques based on a relative logarithmic cost function inspired by the competitive methods from the online learning literature. To improve the converge…
▽ More
We introduce a novel family of adaptive robust equalizers for highly challenging underwater acoustic (UWA) channel equalization. Since the underwater environment is highly non-stationary and subjected to impulsive noise, we use adaptive filtering techniques based on a relative logarithmic cost function inspired by the competitive methods from the online learning literature. To improve the convergence performance of the conventional linear equalization methods, while mitigating the stability issues, we intrinsically combine different norms of the error in the cost function, using logarithmic functions. Hence, we achieve a comparable convergence performance to least mean fourth (LMF) equalizer, while significantly enhancing the stability performance in such an adverse communication medium. We demonstrate the performance of our algorithms through highly realistic experiments performed on accurately simulated underwater acoustic channels.
△ Less
Submitted 19 December, 2015;
originally announced December 2015.
-
Data Imputation through the Identification of Local Anomalies
Authors:
Huseyin Ozkan,
Ozgun S. Pelvan,
Suleyman S. Kozat
Abstract:
We introduce a comprehensive and statistical framework in a model free setting for a complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose i) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions from a given suspicious data instance and ii) a Maximum…
▽ More
We introduce a comprehensive and statistical framework in a model free setting for a complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose i) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions from a given suspicious data instance and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As a generalization to Euclidean distance, we also propose a novel distance measure, which is based on the ranked deviations among the data attributes and empirically shown to be superior in separating the corruptions. Our algorithm first splits the suspicious instance into parts through a binary partitioning tree in the space of data attributes and iteratively tests those parts to detect local anomalies using the nominal statistics extracted from an uncorrupted (clean) reference data set. Once each part is labeled as anomalous vs normal, the corresponding binary patterns over this tree that characterize corruptions are identified and the affected attributes are imputed. Under a certain conditional independency structure assumed for the binary patterns, we analytically show that the false alarm rate of the introduced algorithm in detecting the corruptions is independent of the data and can be directly set without any parameter tuning. The proposed framework is tested over several well-known machine learning data sets with synthetically generated corruptions; and experimentally shown to produce remarkable improvements in terms of classification purposes with strong corruption separation capabilities. Our experiments also indicate that the proposed algorithms outperform the typical approaches and are robust to varying training phase conditions.
△ Less
Submitted 30 September, 2014;
originally announced September 2014.
-
Compressive Diffusion Strategies Over Distributed Networks for Reduced Communication Load
Authors:
Muhammed O. Sayin,
Suleyman S. Kozat
Abstract:
We study the compressive diffusion strategies over distributed networks based on the diffusion implementation and adaptive extraction of the information from the compressed diffusion data. We demonstrate that one can achieve a comparable performance with the full information exchange configurations, even if the diffused information is compressed into a scalar or a single bit. To this end, we provi…
▽ More
We study the compressive diffusion strategies over distributed networks based on the diffusion implementation and adaptive extraction of the information from the compressed diffusion data. We demonstrate that one can achieve a comparable performance with the full information exchange configurations, even if the diffused information is compressed into a scalar or a single bit. To this end, we provide a complete performance analysis for the compressive diffusion strategies. We analyze the transient, steady-state and tracking performance of the configurations in which the diffused data is compressed into a scalar or a single-bit. We propose a new adaptive combination method improving the convergence performance of the compressive diffusion strategies further. In the new method, we introduce one more freedom-of-dimension in the combination matrix and adapt it by using the conventional mixture approach in order to enhance the convergence performance for any possible combination rule used for the full diffusion configuration. We demonstrate that our theoretical analysis closely follow the ensemble averaged results in our simulations. We provide numerical examples showing the improved convergence performance with the new adaptive combination method.
△ Less
Submitted 5 February, 2014;
originally announced February 2014.
-
Predicting Nearly As Well As the Optimal Twice Differentiable Regressor
Authors:
N. Denizcan Vanli,
Muhammed O. Sayin,
Suleyman S. Kozat
Abstract:
We study nonlinear regression of real valued data in an individual sequence manner, where we provide results that are guaranteed to hold without any statistical assumptions. We address the convergence and undertraining issues of conventional nonlinear regression methods and introduce an algorithm that elegantly mitigates these issues via an incremental hierarchical structure, (i.e., via an increme…
▽ More
We study nonlinear regression of real valued data in an individual sequence manner, where we provide results that are guaranteed to hold without any statistical assumptions. We address the convergence and undertraining issues of conventional nonlinear regression methods and introduce an algorithm that elegantly mitigates these issues via an incremental hierarchical structure, (i.e., via an incremental decision tree). Particularly, we present a piecewise linear (or nonlinear) regression algorithm that partitions the regressor space in a data driven manner and learns a linear model at each region. Unlike the conventional approaches, our algorithm gradually increases the number of disjoint partitions on the regressor space in a sequential manner according to the observed data. Through this data driven approach, our algorithm sequentially and asymptotically achieves the performance of the optimal twice differentiable regression function for any data sequence with an unknown and arbitrary length. The computational complexity of the introduced algorithm is only logarithmic in the data length under certain regularity conditions. We provide the explicit description of the algorithm and demonstrate the significant gains for the well-known benchmark real data sets and chaotic signals.
△ Less
Submitted 6 October, 2014; v1 submitted 23 January, 2014;
originally announced January 2014.
-
A Novel Family of Adaptive Filtering Algorithms Based on The Logarithmic Cost
Authors:
Muhammed O. Sayin,
N. Denizcan Vanli,
Suleyman S. Kozat
Abstract:
We introduce a novel family of adaptive filtering algorithms based on a relative logarithmic cost. The new family intrinsically combines the higher and lower order measures of the error into a single continuous update based on the error amount. We introduce important members of this family of algorithms such as the least mean logarithmic square (LMLS) and least logarithmic absolute difference (LLA…
▽ More
We introduce a novel family of adaptive filtering algorithms based on a relative logarithmic cost. The new family intrinsically combines the higher and lower order measures of the error into a single continuous update based on the error amount. We introduce important members of this family of algorithms such as the least mean logarithmic square (LMLS) and least logarithmic absolute difference (LLAD) algorithms that improve the convergence performance of the conventional algorithms. However, our approach and analysis are generic such that they cover other well-known cost functions as described in the paper. The LMLS algorithm achieves comparable convergence performance with the least mean fourth (LMF) algorithm and extends the stability bound on the step size. The LLAD and least mean square (LMS) algorithms demonstrate similar convergence performance in impulse-free noise environments while the LLAD algorithm is robust against impulsive interferences and outperforms the sign algorithm (SA). We analyze the transient, steady state and tracking performance of the introduced algorithms and demonstrate the match of the theoretical analyzes and simulation results. We show the extended stability bound of the LMLS algorithm and analyze the robustness of the LLAD algorithm against impulsive interferences. Finally, we demonstrate the performance of our algorithms in different scenarios through numerical examples.
△ Less
Submitted 26 November, 2013;
originally announced November 2013.
-
A Unified Approach to Universal Prediction: Generalized Upper and Lower Bounds
Authors:
N. Denizcan Vanli,
Suleyman S. Kozat
Abstract:
We study sequential prediction of real-valued, arbitrary and unknown sequences under the squared error loss as well as the best parametric predictor out of a large, continuous class of predictors. Inspired by recent results from computational learning theory, we refrain from any statistical assumptions and define the performance with respect to the class of general parametric predictors. In partic…
▽ More
We study sequential prediction of real-valued, arbitrary and unknown sequences under the squared error loss as well as the best parametric predictor out of a large, continuous class of predictors. Inspired by recent results from computational learning theory, we refrain from any statistical assumptions and define the performance with respect to the class of general parametric predictors. In particular, we present generic lower and upper bounds on this relative performance by transforming the prediction task into a parameter learning problem. We first introduce the lower bounds on this relative performance in the mixture of experts framework, where we show that for any sequential algorithm, there always exists a sequence for which the performance of the sequential algorithm is lower bounded by zero. We then introduce a sequential learning algorithm to predict such arbitrary and unknown sequences, and calculate upper bounds on its total squared prediction error for every bounded sequence. We further show that in some scenarios we achieve matching lower and upper bounds demonstrating that our algorithms are optimal in a strong minimax sense such that their performances cannot be improved further. As an interesting result we also prove that for the worst case scenario, the performance of randomized algorithms can be achieved by sequential algorithms so that randomized algorithms does not improve the performance.
△ Less
Submitted 22 January, 2014; v1 submitted 25 November, 2013;
originally announced November 2013.
-
A Comprehensive Approach to Universal Piecewise Nonlinear Regression Based on Trees
Authors:
N. Denizcan Vanli,
Suleyman S. Kozat
Abstract:
In this paper, we investigate adaptive nonlinear regression and introduce tree based piecewise linear regression algorithms that are highly efficient and provide significantly improved performance with guaranteed upper bounds in an individual sequence manner. We use a tree notion in order to partition the space of regressors in a nested structure. The introduced algorithms adapt not only their reg…
▽ More
In this paper, we investigate adaptive nonlinear regression and introduce tree based piecewise linear regression algorithms that are highly efficient and provide significantly improved performance with guaranteed upper bounds in an individual sequence manner. We use a tree notion in order to partition the space of regressors in a nested structure. The introduced algorithms adapt not only their regression functions but also the complete tree structure while achieving the performance of the "best" linear mixture of a doubly exponential number of partitions, with a computational complexity only polynomial in the number of nodes of the tree. While constructing these algorithms, we also avoid using any artificial "weighting" of models (with highly data dependent parameters) and, instead, directly minimize the final regression error, which is the ultimate performance goal. The introduced methods are generic such that they can readily incorporate different tree construction methods such as random trees in their framework and can use different regressor or partitioning functions as demonstrated in the paper.
△ Less
Submitted 27 December, 2013; v1 submitted 25 November, 2013;
originally announced November 2013.