Search | arXiv e-print repository

Smoothed Analysis of Sequential Probability Assignment

Authors: Alankrita Bhatt, Nika Haghtalab, Abhishek Shetty

Abstract: We initiate the study of smoothed analysis for the sequential probability assignment problem with contexts. We study information-theoretically optimal minimax rates as well as a framework for algorithmic reduction involving the maximum likelihood estimator (MLE) oracle. Our approach establishes a general-purpose reduction from minimax rates for sequential probability assignment for smoothed adversaries to minimax rates for transductive learning. This leads to optimal (logarithmic) fast rates for parametric classes and classes with finite VC dimension. On the algorithmic front, we develop an algorithm that efficiently taps into the MLE oracle, for general classes of functions. We show that under general conditions this algorithmic approach yields sublinear regret.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2207.12382

On Confidence Sequences for Bounded Random Processes via Universal Gambling Strategies

Authors: J. Jon Ryu, Alankrita Bhatt

Abstract: This paper considers the problem of constructing a confidence sequence, which is a sequence of confidence intervals that hold uniformly over time, for estimating the mean of bounded real-valued random processes. This paper revisits the gambling-based approach established in the recent literature from a natural "two-horse race" perspective, and demonstrates new properties of the resulting algorithm induced by Cover's (1991) universal portfolio. The main result of this paper is a new algorithm based on a mixture of lower bounds, which closely approximates the performance of Cover's universal portfolio with constant per-round time complexity. A higher-order generalization of a lower bound on a logarithmic function in (Fan et al., 2015), which is developed as a key technique for the proposed algorithm, may be of independent interest.

Submitted 4 February, 2024; v1 submitted 25 July, 2022; originally announced July 2022.

Comments: 32 pages, 3 figures

arXiv:1910.01570

Prediction of GNSS Phase Scintillations: A Machine Learning Approach

Authors: Kara Lamb, Garima Malhotra, Athanasios Vlontzos, Edward Wagstaff, Atılım Günes Baydin, Anahita Bhiwandiwalla, Yarin Gal, Alfredo Kalaitzis, Anthony Reina, Asti Bhatt

Abstract: A Global Navigation Satellite System (GNSS) uses a constellation of satellites around the Earth for accurate navigation, timing, and positioning. Natural phenomena like space weather introduce irregularities in the Earth's ionosphere, disrupting the propagation of the radio signals that GNSS relies upon. Such disruptions affect both the amplitude and the phase of the propagated waves. No physics-based model currently exists to predict the time and location of these disruptions with sufficient accuracy and at relevant scales. In this paper, we focus on predicting the phase fluctuations of GNSS radio waves, known as phase scintillations. We propose a novel architecture and loss function to predict, 1 hour in advance, the magnitude of phase scintillations within a time window of plus or minus 5 minutes with state-of-the-art performance.

Submitted 3 October, 2019; originally announced October 2019.

Comments: First 4 authors contributed equally. Paper accepted in the Machine Learning for the Physical Sciences workshop of NeurIPS 2019. Camera-ready version to follow.

arXiv:1902.05605

CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity

Authors: Aditya Bhatt, Daniel Palenicek, Boris Belousov, Max Argus, Artemij Amiranashvili, Thomas Brox, Jan Peters

Abstract: Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, found a way to improve the sample efficiency by increasing the update-to-data (UTD) ratio to 20 gradient update steps on the critic per environment sample. However, this comes at the expense of a greatly increased computational cost. To reduce this computational burden, we introduce CrossQ: A lightweight algorithm for continuous control tasks that makes careful use of Batch Normalization and removes target networks to surpass the current state-of-the-art in sample efficiency while maintaining a low UTD ratio of 1. Notably, CrossQ does not rely on advanced bias-reduction schemes used in current methods. CrossQ's contributions are threefold: (1) it matches or surpasses current state-of-the-art methods in terms of sample efficiency, (2) it substantially reduces the computational cost compared to REDQ and DroQ, (3) it is easy to implement, requiring just a few lines of code on top of SAC.

Submitted 25 March, 2024; v1 submitted 14 February, 2019; originally announced February 2019.

Comments: Published at ICLR 2024. Project page at http://aditya.bhatts.org/CrossQ and code release at https://github.com/adityab/CrossQ

arXiv:1902.02441

Artificial Intelligence for Prosthetics - challenge solutions

Authors: Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, et al. (25 additional authors not shown)

Abstract: In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe their algorithms. In this work, we describe the challenge and present thirteen solutions that used deep reinforcement learning approaches. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each team implemented different modifications of the known algorithms by, for example, dividing the task into subtasks, learning low-level control, or by incorporating expert knowledge and using imitation learning.

Submitted 6 February, 2019; originally announced February 2019.

arXiv:1612.00497

Opioid Atlas: Mapping Access to Pain Medication

Authors: Kris Sankaran, Suzanne Tamang, Ami Bhatt

Abstract: Opiates are some of the most effective pain relief medications available for patients suffering from cancer and surgery-related pain. Despite the affordability and effectiveness of these medications, access to opiates is highly geographically variable. Pain researchers have attributed geographic variation to various factors including the fear of opioid addiction, diversion of legal opioids to the underground market and pharmaceutical industry influences. However, the extent to which there is inequity in untreated cancer and surgery-related pain is unknown. To help opioid investigators study these questions, we designed a tool, the Opioid Atlas, for exploring data on legal opioid consumption, by country and time, collected by the International Narcotics Control Board. Our design borrows ideas from the data visualization and multivariate statistics communities, especially the principles of linking and dimensionality reduction. Our work is relevant to policymakers and pain researchers who wish to systematically assess country-level factors that contribute to differences in opioid access for patients with cancer and surgery-related pain. The Opioid Atlas, and the code behind it, is freely available with an open source license.

Submitted 1 December, 2016; originally announced December 2016.

Showing 1–6 of 6 results for author: Bhatt, A