SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning

Authors: Shuai Zhang, Heshan Devaka Fernando, Miao Liu, Keerthiram Murugesan, Songtao Lu, Pin-Yu Chen, Tianyi Chen, Meng Wang

Abstract: This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics. In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward map: the former characterizes the transition dynamics, and the latter characterizes the task-specific reward function. This Q-function decomposition, coupled with a policy improvement operator known as generalized policy improvement (GPI), reduces the sample complexity of finding the optimal Q-function, and thus the SF & GPI framework exhibits promising empirical performance compared to traditional RL methods like Q-learning. However, its theoretical foundations remain largely unestablished, especially when learning the successor features using deep neural networks (SF-DQN). This paper studies provable knowledge transfer using SF-DQN in transfer RL problems. We establish the first convergence analysis with provable generalization guarantees for SF-DQN with GPI. The theory reveals that SF-DQN with GPI outperforms conventional RL approaches, such as deep Q-network, in terms of both faster convergence rate and better generalization. Numerical experiments on real and synthetic RL tasks support the superior performance of SF-DQN & GPI, aligning with our theoretical findings.

Submitted 24 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.16173
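
The decomposition described in the SF-DQN abstract above can be illustrated with a small tabular sketch. It assumes the standard SF & GPI form, Q(s, a) = psi(s, a) · w, with GPI choosing the action that maximizes the best Q-value over a set of source policies. The function names and the toy numbers are hypothetical, and this is not the deep-network method analyzed in the paper.

```python
def q_value(psi_sa, w):
    """Q(s, a) as the dot product of a successor-feature vector and a reward map."""
    return sum(p * wi for p, wi in zip(psi_sa, w))


def gpi_action(psis, w):
    """Generalized policy improvement in a single state.

    psis[i][a] is the (assumed, precomputed) successor-feature vector of
    source policy i for action a in the current state; w is the reward map
    of the new task. Returns argmax_a max_i psis[i][a] . w.
    """
    num_actions = len(psis[0])
    best_per_action = [
        max(q_value(psi[a], w) for psi in psis) for a in range(num_actions)
    ]
    return max(range(num_actions), key=best_per_action.__getitem__)


# Toy example: two source policies, two actions, two feature dimensions.
toy_psis = [
    [[1.0, 0.0], [0.0, 1.0]],  # successor features under source policy 0
    [[0.5, 0.5], [0.2, 0.9]],  # successor features under source policy 1
]
action_for_task_a = gpi_action(toy_psis, [0.0, 1.0])
action_for_task_b = gpi_action(toy_psis, [1.0, 0.0])
```

Only the reward map `w` changes between the two tasks; the successor features are reused, which is the sense in which knowledge transfers across tasks that share dynamics.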

arXiv:2403.16790

Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise

Authors: Dilum Fernando, Dhananjaya Jayasundara, Roshan Godaliyadda, Chaminda Bandara, Parakrama Ekanayake, Vijitha Herath

Abstract: Denoising Diffusion Probabilistic Models (DDPMs) have accomplished much in the realm of generative AI. Despite their high performance, there is room for improvement, especially in terms of sample fidelity, by utilizing statistical properties that impose structural integrity, such as isotropy. Minimizing the mean squared error between the additive and predicted noise alone does not constrain the predicted noise to be isotropic. Thus, we were motivated to utilize the isotropy of the additive noise as a constraint on the objective function to enhance the fidelity of DDPMs. Our approach is simple and can be applied to any DDPM variant. We validate our approach by presenting experiments conducted on four synthetic 2D datasets as well as on unconditional image generation. As demonstrated by the results, the incorporation of this constraint improves the fidelity metrics, Precision and Density, for both the 2D datasets and unconditional image generation.

Submitted 25 March, 2024; originally announced March 2024.
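
The idea in the Iso-Diffusion abstract above, adding an isotropy constraint to the usual noise-prediction error, can be sketched as follows. The exact form of the constraint is not given in the abstract, so this sketch assumes one plausible choice: standard Gaussian noise in d dimensions has expected squared norm d, and the penalty pushes the predicted noise toward that value. The function name, the penalty form, and the weight `lam` are all assumptions, not the authors' objective.

```python
def isotropy_regularized_loss(true_noise, pred_noise, lam=0.1):
    """MSE between true and predicted noise plus an assumed isotropy penalty.

    true_noise and pred_noise are lists of equal-length vectors (one per
    sample). The penalty is (mean squared norm of pred / dim - 1)^2, which
    is zero when the predicted noise has the squared norm expected of
    N(0, I) noise. This penalty form is a hypothetical illustration.
    """
    batch = len(true_noise)
    dim = len(true_noise[0])
    mse = sum(
        (t - p) ** 2
        for t_vec, p_vec in zip(true_noise, pred_noise)
        for t, p in zip(t_vec, p_vec)
    ) / (batch * dim)
    mean_sq_norm = sum(sum(p * p for p in p_vec) for p_vec in pred_noise) / batch
    penalty = (mean_sq_norm / dim - 1.0) ** 2
    return mse + lam * penalty


# A perfect prediction of unit-variance-like noise incurs no loss;
# an all-zero prediction is penalized by both terms.
perfect = isotropy_regularized_loss([[1.0, -1.0]], [[1.0, -1.0]], lam=0.5)
collapsed = isotropy_regularized_loss([[1.0, -1.0]], [[0.0, 0.0]], lam=0.5)
```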

arXiv:2303.17829

Evaluation of Noise Reduction Methods for Sentence Recognition by Sinhala Speaking Listeners

Authors: Malitha Gunawardhana, Chathuki Navanjana, Dinithi Fernando, Nipuna Upeksha, Anjula De Silva

Abstract: Noise reduction is a crucial aspect of hearing aids, which researchers have been striving to address over the years. However, most existing noise reduction algorithms have primarily been evaluated using English. Considering the linguistic differences between the English and Sinhala languages, including variation in syllable structures and vowel duration, it is very important to assess the performance of noise reduction tailored to the Sinhala language. This paper presents a comprehensive comparative analysis of wavelet transformation and adaptive filters for noise reduction in the Sinhala language. We investigate the performance of ten wavelet families with soft and hard thresholding methods against adaptive filters with Normalized Least Mean Square, Least Mean Square Average Normalized Least Mean Square, Recursive Least Square, and Adaptive Filtering Averaging optimization algorithms along with cepstral and energy-based voice activity detection algorithms. The performance evaluation is done using two objective metrics, Signal to Noise Ratio (SNR) and Perceptual Evaluation of Speech Quality (PESQ), and one subjective metric, Mean Opinion Score (MOS). A newly recorded Sinhala language audio dataset and the NOIZEUS database by the University of Texas, Dallas were used for the evaluation. Our code is available at https://github.com/ChathukiKet/Evaluation-of-Noise-Reduction-Methods

Submitted 27 June, 2023; v1 submitted 31 March, 2023; originally announced March 2023.
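
Two ingredients named in the noise-reduction abstract above, hard versus soft thresholding of wavelet coefficients and the SNR metric, have standard textbook definitions that a short sketch can make concrete. The function names and sample values below are hypothetical, and the code is not taken from the authors' repository.

```python
import math


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the others toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def snr_db(clean, estimate):
    """SNR in dB: 10 * log10(signal power / residual-noise power)."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


coeffs = [0.5, -2.0, 1.5]
hard = hard_threshold(coeffs, 1.0)   # small coefficient removed, others kept
soft = soft_threshold(coeffs, 1.0)   # surviving coefficients shrunk by 1.0
snr = snr_db([3.0, 4.0], [3.0, 3.5])
```

Hard thresholding preserves the amplitude of retained coefficients, while soft thresholding trades some amplitude for a continuous shrinkage rule; that difference is why the two are commonly compared in denoising studies.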

arXiv:1908.09775

Multi-Path Learnable Wavelet Neural Network for Image Classification

Authors: D. D. N. De Silva, H. W. M. K. Vithanage, K. S. D. Fernando, I. T. S. Piyatilake

Abstract: Despite the remarkable success of deep learning in pattern recognition, deep network models face the problem of training a large number of parameters. In this paper, we propose and evaluate a novel multi-path wavelet neural network architecture for image classification with far fewer trainable parameters. The model architecture consists of a multi-path layout with several levels of wavelet decompositions performed in parallel, followed by fully connected layers. These decomposition operations comprise wavelet neurons with learnable parameters, which are updated during the training phase using the back-propagation algorithm. We evaluate the performance of the introduced network using common image datasets without data augmentation, except for SVHN, and compare the results with influential deep learning models. Our findings support the possibility of significantly reducing the number of parameters in deep neural networks without compromising their accuracy.

Submitted 26 August, 2019; originally announced August 2019.
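
The multi-path layout in the abstract above, several levels of wavelet decomposition run in parallel and then combined, can be sketched on a 1D signal. This sketch swaps in a fixed Haar wavelet for the paper's learnable wavelet neurons and omits the fully connected layers, so it shows only the data flow. The function names and the choice of depths are assumptions.

```python
import math


def haar_step(signal):
    """One Haar level: return (approximation, detail) coefficient lists."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def multi_path_features(signal, depths=(1, 2)):
    """Run one path per depth and concatenate each path's final coefficients.

    Each path applies `depth` successive Haar levels to the approximation
    coefficients. Fixed Haar filters stand in for learnable wavelets here.
    """
    features = []
    for depth in depths:
        approx = list(signal)
        detail = []
        for _ in range(depth):
            approx, detail = haar_step(approx)
        features.extend(approx + detail)
    return features


example_signal = [1.0, 1.0, 2.0, 2.0]
level_one_approx, level_one_detail = haar_step(example_signal)
example_features = multi_path_features(example_signal)
```

In a trainable version, the concatenated features would feed the fully connected layers, and the filter coefficients would be parameters updated by back-propagation rather than the constants used here.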

Showing 1–4 of 4 results for author: Fernando, D