Search | arXiv e-print repository

Efficiently Esca** Saddle Points for Non-Convex Policy Optimization

Authors: Sadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Niao He, Matthias Grossglauser

Abstract: Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance. In recent years, several variance-reduced PG methods have been proposed with a theoretical guarantee of converging to an approximate first-order stationary point (FOSP) with the sample complexity of $O(ε^{-3})$. However, FOSPs could be bad local optima or saddle points. Moreover, these algori… ▽ More Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance. In recent years, several variance-reduced PG methods have been proposed with a theoretical guarantee of converging to an approximate first-order stationary point (FOSP) with the sample complexity of $O(ε^{-3})$. However, FOSPs could be bad local optima or saddle points. Moreover, these algorithms often use importance sampling (IS) weights which could impair the statistical effectiveness of variance reduction. In this paper, we propose a variance-reduced second-order method that uses second-order information in the form of Hessian vector products (HVP) and converges to an approximate second-order stationary point (SOSP) with sample complexity of $\tilde{O}(ε^{-3})$. This rate improves the best-known sample complexity for achieving approximate SOSPs by a factor of $O(ε^{-0.5})$. Moreover, the proposed variance reduction technique bypasses IS weights by using HVP terms. Our experimental results show that the proposed algorithm outperforms the state of the art and is more robust to changes in random seeds. △ Less

Submitted 15 November, 2023; originally announced November 2023.

Comments: arXiv admin note: text overlap with arXiv:2205.08253

MSC Class: ACM-class:I.2.6

arXiv:2302.05095 [pdf]

A Review on Orbital Angular Momentum With the Approach of Using in Fifth-Generation Mobile Communications

Authors: Seyed Ali Khorasani

Abstract: In this paper, different generations of mobile communication have been concisely mentioned. The need for advanced antenna systems capable of sending and receiving massive data is felt in the fifth generation of mobile communication. The beamforming method and multi-input multi-output systems (MIMO) are the proposed solutions to increase the channel capacity of the communication network. Orbital an… ▽ More In this paper, different generations of mobile communication have been concisely mentioned. The need for advanced antenna systems capable of sending and receiving massive data is felt in the fifth generation of mobile communication. The beamforming method and multi-input multi-output systems (MIMO) are the proposed solutions to increase the channel capacity of the communication network. Orbital angular momentum (OAM), an inherent feature of electromagnetic waves, is a suitable solution to increase channel capacity. This feature will increase the channel capacity by producing orthogonal modes. Using antenna arrays is an effective way to produce these modes. The results of FEKO simulations show the capability of this method. △ Less

Submitted 10 February, 2023; originally announced February 2023.

Comments: 5 pages, 7 figures, 1 table

arXiv:2205.08253 [pdf, other]

Momentum-Based Policy Gradient with Second-Order Information

Authors: Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran

Abstract: Variance-reduced gradient estimators for policy gradient methods have been one of the main focus of research in the reinforcement learning in recent years as they allow acceleration of the estimation process. We propose a variance-reduced policy-gradient method, called SHARP, which incorporates second-order information into stochastic gradient descent (SGD) using momentum with a time-varying learn… ▽ More Variance-reduced gradient estimators for policy gradient methods have been one of the main focus of research in the reinforcement learning in recent years as they allow acceleration of the estimation process. We propose a variance-reduced policy-gradient method, called SHARP, which incorporates second-order information into stochastic gradient descent (SGD) using momentum with a time-varying learning rate. SHARP algorithm is parameter-free, achieving $ε$-approximate first-order stationary point with $O(ε^{-3})$ number of trajectories, while using a batch size of $O(1)$ at each iteration. Unlike most previous work, our proposed algorithm does not require importance sampling which can compromise the advantage of variance reduction process. Moreover, the variance of estimation error decays with the fast rate of $O(1/t^{2/3})$ where $t$ is the number of iterations. Our extensive experimental evaluations show the effectiveness of the proposed algorithm on various control tasks and its advantage over the state of the art in practice. △ Less

Submitted 26 November, 2023; v1 submitted 17 May, 2022; originally announced May 2022.

arXiv:2110.03706 [pdf, other]

SVG-Net: An SVG-based Trajectory Prediction Model

Authors: Mohammadhossein Bahari, Vahid Zehtab, Sadegh Khorasani, Sana Ayromlou, Saeed Saadatnejad, Alexandre Alahi

Abstract: Anticipating motions of vehicles in a scene is an essential problem for safe autonomous driving systems. To this end, the comprehension of the scene's infrastructure is often the main clue for predicting future trajectories. Most of the proposed approaches represent the scene with a rasterized format and some of the more recent approaches leverage custom vectorized formats. In contrast, we propose… ▽ More Anticipating motions of vehicles in a scene is an essential problem for safe autonomous driving systems. To this end, the comprehension of the scene's infrastructure is often the main clue for predicting future trajectories. Most of the proposed approaches represent the scene with a rasterized format and some of the more recent approaches leverage custom vectorized formats. In contrast, we propose representing the scene's information by employing Scalable Vector Graphics (SVG). SVG is a well-established format that matches the problem of trajectory prediction better than rasterized formats while being more general than arbitrary vectorized formats. SVG has the potential to provide the convenience and generality of raster-based solutions if coupled with a powerful tool such as CNNs, for which we introduce SVG-Net. SVG-Net is a Transformer-based Neural Network that can effectively capture the scene's information from SVG inputs. Thanks to the self-attention mechanism in its Transformers, SVG-Net can also adequately apprehend relations amongst the scene and the agents. We demonstrate SVG-Net's effectiveness by evaluating its performance on the publicly available Argoverse forecasting dataset. Finally, we illustrate how, by using SVG, one can benefit from datasets and advancements in other research fronts that also utilize the same input format. Our code is available at https://vita-epfl.github.io/SVGNet/. △ Less

Submitted 11 October, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

MSC Class: 68T45 68T07 ACM Class: I.2.10; I.5.1; I.5.4

arXiv:1910.06239 [pdf, other]

Two-sample Testing Using Deep Learning

Authors: Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert

Abstract: We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are consistent and asymptotically control the type-1 error rate. Their test statistics can be evaluated in linear time (in the sample size). Suitable data representat… ▽ More We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are consistent and asymptotically control the type-1 error rate. Their test statistics can be evaluated in linear time (in the sample size). Suitable data representations are obtained in a data-driven way, by solving a supervised or unsupervised transfer-learning task on an auxiliary (potentially distinct) data set. If no auxiliary data is available, we split the data into two chunks: one for learning representations and one for computing the test statistic. In experiments on audio samples, natural images and three-dimensional neuroimaging data our tests yield significant decreases in type-2 error rate (up to 35 percentage points) compared to state-of-the-art two-sample tests such as kernel-methods and classifier two-sample tests. △ Less

Submitted 10 March, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

arXiv:1906.10496 [pdf]

The Coming Age of Pervasive Data Processing

Authors: Jan S. Rellermeyer, Sobhan Omranian Khorasani, Dan Graur, Apourva Parthasarathy

Abstract: Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of data-intensive workloads, the ever-increasing demand of applications have made us reconsider the traditional ways of scaling (e.g., scale-out) and seek new oppo… ▽ More Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of data-intensive workloads, the ever-increasing demand of applications have made us reconsider the traditional ways of scaling (e.g., scale-out) and seek new opportunities for improving the performance. In order to prepare for an era where data collection and processing occur on a wide range of devices, from powerful HPC machines to small embedded devices, it is crucial to investigate and eliminate the potential sources of inefficiency in the current state of the art platforms. In this paper, we address the current and upcoming challenges of pervasive data processing and present directions for designing the next generation of large-scale data processing systems. △ Less

Submitted 21 June, 2019; originally announced June 2019.

Comments: ISPDC 2019

arXiv:1812.00448 [pdf, other]

Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer's disease

Authors: Stefan Konigorski, Shahryar Khorasani, Christoph Lippert

Abstract: For precision medicine and personalized treatment, we need to identify predictive markers of disease. We focus on Alzheimer's disease (AD), where magnetic resonance imaging scans provide information about the disease status. By combining imaging with genome sequencing, we aim at identifying rare genetic markers associated with quantitative traits predicted from convolutional neural networks (CNNs)… ▽ More For precision medicine and personalized treatment, we need to identify predictive markers of disease. We focus on Alzheimer's disease (AD), where magnetic resonance imaging scans provide information about the disease status. By combining imaging with genome sequencing, we aim at identifying rare genetic markers associated with quantitative traits predicted from convolutional neural networks (CNNs), which traditionally have been derived manually by experts. Kernel-based tests are a powerful tool for associating sets of genetic variants, but how to optimally model rare genetic variants is still an open research question. We propose a generalized set of kernels that incorporate prior information from various annotations and multi-omics data. In the analysis of data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), we evaluate whether (i) CNNs yield precise and reliable brain traits, and (ii) the novel kernel-based tests can help to identify loci associated with AD. The results indicate that CNNs provide a fast, scalable and precise tool to derive quantitative AD traits and that new kernels integrating domain knowledge can yield higher power in association tests of very rare variants. △ Less

Submitted 5 March, 2019; v1 submitted 2 December, 2018; originally announced December 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018, arXiv:1811.07216

Report number: ML4H/2018/129

Showing 1–7 of 7 results for author: Khorasani, S