Search | arXiv e-print repository

Naive Bayes-based Context Extension for Large Language Models

Authors: Jianlin Su, Murtadha Ahmed, Wenbo, Luo Ao, Mingren Zhu, Yunfeng Liu

Abstract: Large Language Models (LLMs) have shown promising in-context learning abilities. However, conventional In-Context Learning (ICL) approaches are often impeded by length limitations of the transformer architecture, which pose challenges when attempting to effectively integrate supervision from a substantial number of demonstration examples. In this paper, we introduce a novel framework, called Naive Bayes-based Context Extension (NBCE), to enable existing LLMs to perform ICL with an increased number of demonstrations by significantly expanding their context size. Importantly, this expansion does not require fine-tuning or dependence on particular model architectures, all the while preserving linear efficiency. NBCE initially splits the context into equal-sized windows fitting the target LLM's maximum length. Then, it introduces a voting mechanism to select the most relevant window, regarded as the posterior context. Finally, it employs Bayes' theorem to generate the test task. Our experimental results demonstrate that NBCE substantially enhances performance, particularly as the number of demonstration examples increases, consistently outperforming alternative methods. The NBCE code is available at: https://github.com/amurtadha/NBCE-master

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted to main NAACL 2024
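
Illustrative sketch (not from the paper): the three steps named in the NBCE abstract above (split the context into windows, select one window, combine with Bayes' rule) could look roughly like the following for a single next-token step. The minimum-entropy selection rule, the weight `beta`, and all function names are assumptions made for illustration; the abstract does not specify them. In practice the per-window logits would come from running an LLM once per window, and the prior logits from running it with no context.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def entropy(logp):
    # Shannon entropy of a distribution given as log-probabilities.
    return -sum(math.exp(lp) * lp for lp in logp)

def nbce_combine(window_logits, prior_logits, beta=0.25):
    """Combine next-token predictions from several context windows.

    window_logits: one logit vector per context window (hypothetical input).
    prior_logits:  logit vector obtained without any context.
    beta:          assumed weight on the prior correction term.
    """
    window_logp = [log_softmax(l) for l in window_logits]
    prior_logp = log_softmax(prior_logits)
    # Assumed "voting" rule: keep the window whose prediction is most confident.
    best = min(window_logp, key=entropy)
    # Naive Bayes style correction: up-weight the selected window's
    # log-probabilities and subtract the context-free prior.
    combined = [(1 + beta) * w - beta * p for w, p in zip(best, prior_logp)]
    # Renormalise so the result is a valid log-distribution.
    return log_softmax(combined)
```

Because each window is scored independently, the cost grows linearly with the number of windows, which is consistent with the linear efficiency the abstract mentions.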

arXiv:2401.07462 [pdf, other]

doi 10.1140/epjc/s10052-024-12770-1

Nonproportionality of NaI(Tl) Scintillation Detector for Dark Matter Search Experiments

Authors: S. M. Lee, G. Adhikari, N. Carlin, J. Y. Cho, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. França, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, S. W. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, et al. (37 additional authors not shown)

Abstract: We present a comprehensive study of the nonproportionality of NaI(Tl) scintillation detectors within the context of dark matter search experiments. Our investigation, which integrates COSINE-100 data with supplementary $γ$ spectroscopy, measures light yields across diverse energy levels from full-energy $γ$ peaks produced by the decays of various isotopes. These $γ$ peaks of interest were produced by decays supported by both long and short-lived isotopes. Analyzing peaks from decays supported only by short-lived isotopes presented a unique challenge due to their limited statistics and overlapping energies, which was overcome by long-term data collection and a time-dependent analysis. A key achievement is the direct measurement of the 0.87 keV light yield, resulting from the cascade following electron capture decay of $^{22}$Na from internal contamination. This measurement, previously accessible only indirectly, deepens our understanding of NaI(Tl) scintillator behavior in the region of interest for dark matter searches. This study holds substantial implications for background modeling and the interpretation of dark matter signals in NaI(Tl) experiments.

Submitted 10 May, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

Comments: 12 pages, 7 figures

Journal ref: Eur. Phys. J. C 84 (2024) 484

arXiv:2311.18699 [pdf, other]

Gaussian processes Correlated Bayesian Additive Regression Trees

Authors: Xuetao Lu, Robert E. McCulloch

Abstract: In recent years, Bayesian Additive Regression Trees (BART) has garnered increased attention, leading to the development of various extensions for diverse applications. However, there has been limited exploration of its utility in analyzing correlated data. This paper introduces a novel extension of BART, named Correlated BART (CBART). Unlike the original BART with independent errors, CBART is specifically designed to handle correlated (dependent) errors. Additionally, we propose the integration of CBART with Gaussian processes (GP) to create a new model termed GP-CBART. This innovative model combines the strengths of Gaussian processes and CBART, making it particularly well-suited for analyzing time series or spatial data. In the GP-CBART framework, CBART captures the nonlinearity in the mean regression (covariates) function, while the Gaussian process models the correlation structure within the response. Additionally, given the high flexibility of both CBART and GP models, their combination may lead to identification issues. We provide methods to address these challenges. To demonstrate the effectiveness of CBART and GP-CBART, we present corresponding simulated and real-world examples.

Submitted 30 November, 2023; originally announced November 2023.

arXiv:2311.17235 [pdf]

doi 10.2458/azu_uapress_9780816540945-ch012

Photochemistry and Haze Formation

Authors: Mandt K. E., Luspay-Kuti A., Cheng A., Jessup K. -L., Gao P

Abstract: One of the many exciting revelations of the New Horizons flyby of Pluto was the observation of global haze layers at altitudes as high as 200 km in the visible wavelengths. This haze is produced in the upper atmosphere through photochemical processes, similar to the processes in Titan's atmosphere. As the haze particles grow in size and descend to the lower atmosphere, they coagulate and interact with the gases in the atmosphere through condensation and sticking processes that serve as temporary and permanent loss processes. New Horizons observations confirm studies of Titan haze analogs suggesting that photochemically produced haze particles harden as they grow in size. We outline in this chapter what is known about the photochemical processes that lead to haze production and outline feedback processes resulting from the presence of haze in the atmosphere, connect this to the evolution of Pluto's atmosphere, and discuss open questions that need to be addressed in future work.

Submitted 28 November, 2023; originally announced November 2023.

MSC Class: 85-01

Journal ref: In Pluto System After New Horizons (S. A. Stern, R. P. Binzel, W. M. Grundy, J. M. Moore, and L. A. Young, eds.), Univ. of Arizona, Tucson (2021)

arXiv:2310.18743 [pdf, other]

Optimization of utility-based shortfall risk: A non-asymptotic viewpoint

Authors: Sumedh Gupte, Prashanth L. A., Sanjay P. Bhat

Abstract: We consider the problems of estimation and optimization of utility-based shortfall risk (UBSR), which is a popular risk measure in finance. In the context of UBSR estimation, we derive a non-asymptotic bound on the mean-squared error of the classical sample average approximation (SAA) of UBSR. Next, in the context of UBSR optimization, we derive an expression for the UBSR gradient under a smooth parameterization. This expression is a ratio of expectations, both of which involve the UBSR. We use SAA for the numerator as well as denominator in the UBSR gradient expression to arrive at a biased gradient estimator. We derive non-asymptotic bounds on the estimation error, which show that our gradient estimator is asymptotically unbiased. We incorporate the aforementioned gradient estimator into a stochastic gradient (SG) algorithm for UBSR optimization. Finally, we derive non-asymptotic bounds that quantify the rate of convergence of our SG algorithm for UBSR optimization.

Submitted 30 March, 2024; v1 submitted 28 October, 2023; originally announced October 2023.
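
Illustrative sketch (not from the paper): the sample average approximation (SAA) of UBSR mentioned in the abstract above can be pictured as a one-dimensional root-finding problem. The sketch assumes the common definition SR(X) = inf{t : E[l(X - t)] <= lam} for an increasing loss function l and threshold lam; this convention, the bisection solver, and the names `ubsr_saa`, `loss`, and `lam` are assumptions for illustration, not details taken from the abstract.

```python
import math

def ubsr_saa(samples, loss, lam, tol=1e-10):
    """SAA estimate of UBSR: solve mean(loss(x - t)) = lam for t by bisection.

    Assumes `loss` is continuous and increasing, and that lam lies strictly
    inside the range of the averaged loss (otherwise the bracket search below
    does not terminate).
    """
    n = len(samples)

    def g(t):
        # Empirical constraint function; decreasing in t because loss is increasing.
        return sum(loss(x - t) for x in samples) / n - lam

    # Expand a bracket [lo, hi] until g(lo) >= 0 >= g(hi).
    lo, hi = min(samples) - 1.0, max(samples) + 1.0
    step = 1.0
    while g(lo) < 0:
        lo -= step
        step *= 2
    step = 1.0
    while g(hi) > 0:
        hi += step
        step *= 2

    # Standard bisection on the decreasing function g.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return hi
```

With the exponential loss `math.exp` and `lam = 1.0`, the solution has the closed form log(mean(exp(x))), which gives a simple sanity check for the solver.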

arXiv:2308.16386 [pdf]

RGB-T Tracking via Multi-Modal Mutual Prompt Learning

Authors: Yang Luo, Xiqing Guo, Hui Feng, Lei Ao

Abstract: Object tracking based on the fusion of visible and thermal images, known as RGB-T tracking, has gained increasing attention from researchers in recent years. How to achieve a more comprehensive fusion of information from the two modalities with fewer computational costs has been a problem that researchers have been exploring. Recently, with the rise of prompt learning in computer vision, we can better transfer knowledge from visual large models to downstream tasks. Considering the strong complementarity between visible and thermal modalities, we propose a tracking architecture based on mutual prompt learning between the two modalities. We also design a lightweight prompter that incorporates attention mechanisms in two dimensions to transfer information from one modality to the other with lower computational costs, embedding it into each layer of the backbone. Extensive experiments have demonstrated that our proposed tracking architecture is effective and efficient, achieving state-of-the-art performance while maintaining high running speeds.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 9 pages, 5 figures, 5 tables

arXiv:2307.16495 [pdf]

doi 10.1016/j.tsep.2023.102333

Through-chip microchannels for three-dimensional integrated circuits cooling

Authors: Lihong Ao, Aymeric Ramiere

Abstract: Cooling high-power electronics in multilayer integrated circuits (ICs) is challenging for existing cooling methods. In this work, we designed through-chip microchannels (TCMCs) that cross the entire chip perpendicularly to the layers, with water circulating inside to provide direct cooling to each layer. TCMCs are organized in a square array where the pitch and radius of the microchannels are explored. Our computational fluid dynamics (CFD) simulations show that a pitch of 10 μm and a radius of 1 μm optimize the cooling performance to support a power higher than 10^4 W/cm^2 while the maximum temperature rise remains below 60 K with a water inlet temperature of 300 K. We show that the cooling properties do not change with the number of layers for a given chip thickness, which provides flexibility to the functional design of the chip. Though manufacturing may be challenging, TCMCs offer a new way for chip cooling that could provide a leap forward in the performance of multilayer 3D ICs and high-power electronics.

Submitted 19 December, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

Journal ref: Thermal Science and Engineering Progress, 102333 (2024)

arXiv:2306.09854 [pdf, ps, other]

doi 10.1051/0004-6361/202346503

Searching for Milky Way twins: Radial abundance distribution as a strict criterion

Authors: Pilyugin L. S., Tautvaisiene G., Lara-Lopez M. A

Abstract: We search for Milky Way-like galaxies among a sample of approximately 500 galaxies. The characteristics we considered of the candidate galaxies are the following: stellar mass M_star, optical radius R_25, rotation velocity V_rot, central oxygen abundance (O/H)_0, and abundance at the optical radius (O/H)_R25. If the values of R_25 and M_star of the galaxy were close to that of the Milky Way, then the galaxy was referred to as a structural Milky Way analogue (sMWA). The oxygen abundance at a given radius of a galaxy is defined by the evolution of that region, and we then assumed that the similarity of (O/H)_0 and (O/H)_R25 in two galaxies suggests a similarity in their evolution. If the values of (O/H)_0 and (O/H)_R25 in the galaxy were close to that of the Milky Way, then the galaxy was referred to as an evolutionary Milky Way analogue (eMWA). If the galaxy was simultaneously an eMWA and sMWA, then the galaxy was considered a Milky Way twin. We find that the position of the Milky Way on the (O/H)_0 - (O/H)_R25 diagram shows a large deviation from the general trend in the sense that the (O/H)_R25 in the Milky Way is appreciably lower than in other galaxies of similar (O/H)_0. This feature of the Milky Way evidences that its (chemical) evolution is not typical. We identify four galaxies (NGC 3521, NGC 4651, NGC 2903, and MaNGA galaxy M-8341-09101) that are simultaneously sMWA and eMWA and can therefore be considered as Milky Way twins.
In previous studies, Milky Way-like galaxies were selected using structural and morphological characteristics, that is, sMWAs were selected. We find that the abundances at the centre and at the optical radius (evolutionary characteristics) provide a stricter criterion for selecting real Milky Way twins.

Submitted 16 June, 2023; originally announced June 2023.

Comments: Accepted to Astronomy and Astrophysics, 28 pages, 13 figures

Journal ref: A&A 676, A57 (2023)

arXiv:2304.10951 [pdf, ps, other]

A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning

Authors: Mizhaan Prajit Maniyar, Akash Mondal, Prashanth L. A., Shalabh Bhatnagar

Abstract: We consider the problem of control in the setting of reinforcement learning (RL), where model information is not available. Policy gradient algorithms are a popular solution approach for this problem and are usually shown to converge to a stationary point of the value function. In this paper, we propose two policy Newton algorithms that incorporate cubic regularization. Both algorithms employ the likelihood ratio method to form estimates of the gradient and Hessian of the value function using sample trajectories. The first algorithm requires an exact solution of the cubic regularized problem in each iteration, while the second algorithm employs an efficient gradient descent-based approximation to the cubic regularized problem. We establish convergence of our proposed algorithms to a second-order stationary point (SOSP) of the value function, which results in the avoidance of traps in the form of saddle points. In particular, the sample complexity of our algorithms to find an $ε$-SOSP is $O(ε^{-3.5})$, which is an improvement over the state-of-the-art sample complexity of $O(ε^{-4.5})$.

Submitted 21 April, 2023; originally announced April 2023.

arXiv:2211.05910 [pdf, other]

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, et al. (71 additional authors not shown)

Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.

Submitted 7 November, 2022; originally announced November 2022.

Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

arXiv:2210.05918 [pdf, ps, other]

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation

Authors: Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup

Abstract: We study the finite-time behaviour of the popular temporal difference (TD) learning algorithm when combined with tail-averaging. We derive finite time bounds on the parameter error of the tail-averaged TD iterate under a step-size choice that does not require information about the eigenvalues of the matrix underlying the projected TD fixed point. Our analysis shows that tail-averaged TD converges at the optimal $O\left(1/t\right)$ rate, both in expectation and with high probability. In addition, our bounds exhibit a sharper rate of decay for the initial error (bias), which is an improvement over averaging all iterates. We also propose and analyse a variant of TD that incorporates regularisation. From analysis, we conclude that the regularised version of TD is useful for problems with ill-conditioned features.

Submitted 11 September, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Journal ref: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, 2023
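
Illustrative sketch (not from the paper): tail-averaging, as described in the abstract above, means returning the average of only the later TD iterates rather than the last iterate or the average of all of them. The sketch below uses the standard TD(0) update with linear function approximation; the constant step size, the `tail_start` index, and the function and argument names are assumptions chosen for illustration.

```python
def tail_averaged_td0(transitions, features, gamma, step_size, tail_start):
    """Run TD(0) with linear features and return the tail-averaged parameter.

    transitions: iterable of (state, reward, next_state) tuples.
    features:    dict mapping each state to its feature vector (list of floats).
    tail_start:  index of the first update whose iterate enters the average.
    """
    dim = len(next(iter(features.values())))
    theta = [0.0] * dim
    tail_sum = [0.0] * dim
    tail_count = 0

    for t, (s, r, s_next) in enumerate(transitions):
        phi, phi_next = features[s], features[s_next]
        v = sum(a * b for a, b in zip(theta, phi))
        v_next = sum(a * b for a, b in zip(theta, phi_next))
        # TD(0) update: move theta along phi(s), scaled by the TD error.
        td_error = r + gamma * v_next - v
        theta = [th + step_size * td_error * p for th, p in zip(theta, phi)]
        # Only iterates from tail_start onwards contribute to the average,
        # so the early, bias-dominated iterates are discarded.
        if t >= tail_start:
            tail_sum = [acc + th for acc, th in zip(tail_sum, theta)]
            tail_count += 1

    if tail_count == 0:
        raise ValueError("tail_start must be smaller than the number of transitions")
    return [acc / tail_count for acc in tail_sum]
```

As a sanity check, a single state with reward 1 and discount 0.5 has value 2, so with the one-dimensional feature `[1.0]` the tail-averaged parameter should be close to 2.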

arXiv:2209.13967 [pdf, ps, other]

doi 10.1051/0004-6361/202244231

Calibration-based abundances in the interstellar gas of galaxies from slit and IFU spectra

Authors: Pilyugin L. S., Lara-Lopez M. A., Vilchez J. M., Duarte Puertas S., Zinchenko I. A., Dors O. L

Abstract: In this work we make use of available Integral Field Unit (IFU) spectroscopy and slit spectra of several nearby galaxies. The pre-existing empirical R and S calibrations for abundance determinations are constructed using a sample of HII regions with high quality slit spectra. In this paper, we test the applicability of those calibrations to the IFU spectra. We estimate the calibration-based abundances obtained using both the IFU and the slit spectroscopy for eight nearby galaxies. The median values of the slit and IFU spectra-based abundances in bins of 0.1 in fractional radius Rg (normalized to the optical radius) of a galaxy are determined and compared. We find that the IFU and the slit spectra-based abundances obtained through the R calibration are close to each other: the mean value of the differences of abundances is 0.005 dex and the scatter in the differences is 0.037 dex for 38 data points. The S calibration can produce systematically underestimated values of the IFU spectra-based abundances at high metallicities: the mean value of the differences is -0.059 dex for 21 data points, while at lower metallicities the mean value of the differences is -0.018 dex and the scatter is 0.045 dex for 36 data points. This evidences that the R calibration produces more consistent abundance estimations between the slit and the IFU spectra than the S calibration. We find that the same calibration can produce close estimations of the abundances using IFU spectra obtained with different spatial resolution and different spatial samplings.
This is in line with the recent finding that the contribution of the diffuse ionized gas to the large aperture spectra of HII regions has a secondary effect.

Submitted 28 September, 2022; originally announced September 2022.

Comments: 15 pages, 14 figures, accepted to the Astronomy and Astrophysics

Journal ref: A&A 668, A5 (2022)

arXiv:2208.04571 [pdf, other]

doi 10.1016/j.jheap.2022.05.001

The ASTRI Mini-Array of Cherenkov Telescopes at the Observatorio del Teide

Authors: Scuderi S., Giuliani A., Pareschi G., Tosti G., Catalano O., Amato E., Antonelli L. A., Becerra Gonzáles J., Bellassai G., Bigongiari C., Biondo B., Böttcher M., Bonanno G., Bonnoli G., Bruno P., Bulgarelli A., Canestrari R., Capalbi M., Caraveo P., Cardillo M., Conforti V., Contino G., Corpora M., Costa A., et al. (73 additional authors not shown)

Abstract: The ASTRI Mini-Array (MA) is an INAF project to build and operate a facility to study astronomical sources emitting at very high energy in the TeV spectral band. The ASTRI MA consists of a group of nine innovative Imaging Atmospheric Cherenkov telescopes. The telescopes will be installed at the Teide Astronomical Observatory of the Instituto de Astrofisica de Canarias (IAC) in Tenerife (Canary Islands, Spain) on the basis of a host agreement with INAF. Thanks to its expected overall performance, better than that of current Cherenkov telescope arrays for energies above $\sim 5$ TeV and up to 100 TeV and beyond, the ASTRI MA will represent an important instrument to perform deep observations of the Galactic and extra-Galactic sky at these energies.

Submitted 9 August, 2022; originally announced August 2022.

Comments: 19 pages, 22 figures

Journal ref: Journal of High Energy Astrophysics, Volume 35, p. 52-68 (2022)

arXiv:2208.00290 [pdf, ps, other]

A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization

Authors: Akash Mondal, Prashanth L. A., Shalabh Bhatnagar

Abstract: In this paper, we present a stochastic gradient algorithm for minimizing a smooth objective function that is an expectation over noisy cost samples, and only the latter are observed for any given parameter. Our algorithm employs a gradient estimation scheme with random perturbations, which are formed using the truncated Cauchy distribution from the delta sphere. We analyze the bias and variance of the proposed gradient estimator. Our algorithm is found to be particularly useful in the case when the objective function is non-convex, and the parameter dimension is high. From an asymptotic convergence analysis, we establish that our algorithm converges almost surely to the set of stationary points of the objective function and obtain the asymptotic convergence rate. We also show that our algorithm avoids unstable equilibria, implying convergence to local minima. Further, we perform a non-asymptotic convergence analysis of our algorithm. In particular, we establish here a non-asymptotic bound for finding an epsilon-stationary point of the non-convex objective function. Finally, we demonstrate numerically through simulations that our algorithm outperforms GSF, SPSA, and RDSA by a significant margin over a few non-convex settings and further validate its performance over convex (noisy) objectives.

Submitted 30 June, 2023; v1 submitted 30 July, 2022; originally announced August 2022.

arXiv:2205.05843 [pdf, ps, other]

A Survey of Risk-Aware Multi-Armed Bandits

Authors: Vincent Y. F. Tan, Prashanth L. A., Krishna Jagannathan

Abstract: In several applications such as clinical trials and financial portfolio optimization, the expected value (or the average reward) does not satisfactorily capture the merits of a drug or a portfolio. In such applications, risk plays a crucial role, and a risk-aware performance measure is preferable, so as to capture losses in the case of adverse events. This survey aims to consolidate and summarise the existing research on risk measures, specifically in the context of multi-armed bandits. We review various risk measures of interest, and comment on their properties. Next, we review existing concentration inequalities for various risk measures. Then, we proceed to defining risk-aware bandit problems. We consider algorithms for the regret minimization setting, where the exploration-exploitation trade-off manifests, as well as the best-arm identification setting, which is a pure exploration problem -- both in the context of risk-sensitive measures. We conclude by commenting on persisting challenges and fertile areas for future research.

Submitted 11 May, 2022; originally announced May 2022.

Comments: 11 pages; Unabridged version of a survey paper of the same title accepted to IJCAI-ECAI, 2022

arXiv:2204.11026 [pdf]

Bioinformatic analysis for structure and function of Glutamine synthetase (GS)

Authors: Jiahao Ma, Guotong Xu, Le Ao, Siqi Chen, Jingze Liu

Abstract: Objective: To predict the structure and function of Glutamine synthetase (GS) from Pseudoalteromonas sp. by bioinformatics technology, and to provide a theoretical basis for further study. Methods: The open reading frame (ORF) of the GS sequence from Pseudoalteromonas sp. was obtained by ORF finder and was translated into amino acid residues. The structure domain was analyzed by Blast. Using the analysis tools Protparam, ProtScale, SignalP-4.0, TMHMM, SOPMA, SWISS-MODEL, NCBI SMART-BLAST and MEGA 7.0, the structure and function of the protein were predicted and analyzed. Results: The results showed that the sequence was GS with 468 amino acid residues and a theoretical molecular weight of 51986.64 Da. The protein is evolutionarily closest to that of Shewanella oneidensis. It had no signal peptide site and no transmembrane domain. The secondary structure of GS contained 35.04% alpha-helix, 16.67% extended chain, 5.34% beta-turn, and 42.95% random coil. Conclusions: This GS is a protein with a variety of biological functions that may be used as a molecular sample of microbial nitrogen metabolism in extreme environments.

Submitted 23 April, 2022; originally announced April 2022.

Comments: 8 pages, 8 figures

arXiv:2202.11046 [pdf, other]

A policy gradient approach for optimization of smooth risk measures

Authors: Nithia Vijayan, Prashanth L. A

Abstract: We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning (RL) problem in on-policy as well as off-policy settings. We consider episodic Markov decision processes, and model the risk using the broad class of smooth risk measures of the cumulative discounted reward. We propose two template policy gradient algorithms that optimize a smooth risk measure in on-policy and off-policy RL settings, respectively. We derive non-asymptotic bounds that quantify the rate of convergence of our proposed algorithms to a stationary point of the smooth risk measure. As special cases, we establish that our algorithms apply to optimization of mean-variance and distortion risk measures, respectively.

Submitted 23 June, 2024; v1 submitted 22 February, 2022; originally announced February 2022.

Comments: arXiv admin note: text overlap with arXiv:2107.04422

arXiv:2201.00200 [pdf, other]

doi 10.1051/0004-6361/202142666

Local heating due to convective overshooting and the solar modelling problem

Authors: Baraffe I, Constantino T, Clarke J, Le Saux A, Goffrey T, Guillet T, Pratt J, Vlaykov D. G

Abstract: Recent hydrodynamical simulations of convection in a solar-like model suggest that penetrative convective flows at the boundary of the convective envelope modify the thermal background in the overshooting layer. Based on these results, we implement in one-dimensional stellar evolution codes a simple prescription to modify the temperature gradient below the convective boundary of a solar model. This simple prescription qualitatively reproduces the behaviour found in the hydrodynamical simulations, namely a local heating and smoothing of the temperature gradient below the convective boundary. We show that introducing local heating in the overshooting layer can reduce the sound-speed discrepancy usually reported between solar models and the structure of the Sun inferred from helioseismology. It also affects key quantities in the convective envelope, such as the density, the entropy, and the speed of sound. These effects could help reduce the discrepancies between solar models and observed constraints based on seismic inversions of the Ledoux discriminant. Since mixing due to overshooting and local heating are the result of the same convective penetration process, the goal of this work is to invite solar modellers to consider both processes for a more consistent approach.

Submitted 1 January, 2022; originally announced January 2022.

Comments: 7 pages, 4 figures, accepted for publication in A&A

Journal ref: A&A 659, A53 (2022)

arXiv:2111.00481 [pdf, other]

Measurements of thermal relaxation of the OGRAN underground setup

Authors: Gavrilyuk Y. M., Gusev A. V., Kvashnin N. L., Lugovoy A. A., Oreshkin S. I., Popov S. M., Rudenko V. N., Semenov V. V., Syrovatsky I. A

Abstract: An upgraded version of OGRAN, a combined optical-acoustic gravitational wave detector, has been investigated in a long-term operation mode. This installation, located at the Baksan Neutrino Observatory (BNO) INR RAS, is designed to work under the program for detecting collapsing stars in parallel with the neutrino detector, the Baksan Underground Scintillation Telescope (BUST). Such a joint search corresponds to the modern trend toward the development of "multi-messenger astronomy". In this work, the effects of thermal relaxation of OGRAN are experimentally investigated using passive and active thermal stabilization systems in the underground laboratory BNO PK-14.

Submitted 31 October, 2021; originally announced November 2021.

arXiv:2110.10263 [pdf, other]

doi 10.1117/12.2597170

Current status of PAPYRUS: the pyramid based adaptive optics system at LAM/OHP

Authors: Muslimov E., Levraud N., Chambouleyron V., Boudjema I., Lau A., Caillat A., Pedreros F., Otten G., El Hadi K., Joaquina K., Lopez M., El Morsy M., Beltramo Martin O., Fetick R., Ke Z., Sauvage J-F., Neichel B., Fusco T., Schmitt J., Le Van Suu A., Charton J., Schimpf A., Martin B., Dintrono F., Esposito S., et al. (1 additional author not shown)

Abstract: The Provence Adaptive optics Pyramid Run System (PAPYRUS) is a pyramid-based Adaptive Optics (AO) system that will be installed at the Coude focus of the 1.52m telescope (T152) at the Observatoire de Haute Provence (OHP). The project is being developed by PhD students and Postdocs across France with support from staff members, consolidating the existing expertise and hardware into an R&D testbed. This testbed allows us to run various pyramid wavefront sensing (WFS) control algorithms on-sky and to experiment with new concepts for wavefront control, with the additional benefit of the high number of available nights at this telescope. It will also function as a teaching tool for students during the planned AO summer school at OHP. To our knowledge, this is one of the first pedagogic pyramid-based AO systems on-sky. The key components of PAPYRUS are a 17x17-actuator Alpao deformable mirror with an Alpao RTC, a very low noise OCAM2k camera, and a 4-face glass pyramid. PAPYRUS is designed to be a simple and modular system for exploring wavefront control with a pyramid WFS on sky. We present an overview of PAPYRUS, a description of the opto-mechanical design, and the current status of the project.

Submitted 19 October, 2021; originally announced October 2021.

Comments: 19 pages, 11 figures

Journal ref: Proc. SPIE 11876, Optical Instrument Science, Technology, and Applications II, 118760H (24 September 2021)

arXiv:2107.04422 [pdf, other]

Policy Gradient Methods for Distortion Risk Measures

Authors: Nithia Vijayan, Prashanth L. A

Abstract: We propose policy gradient algorithms which learn risk-sensitive policies in a reinforcement learning (RL) framework. Our proposed algorithms maximize the distortion risk measure (DRM) of the cumulative reward in an episodic Markov decision process in on-policy and off-policy RL settings, respectively. We derive a variant of the policy gradient theorem that caters to the DRM objective, and integrate it with a likelihood ratio-based gradient estimation scheme. We derive non-asymptotic bounds that establish the convergence of our proposed algorithms to an approximate stationary point of the DRM objective.

Submitted 4 February, 2024; v1 submitted 9 July, 2021; originally announced July 2021.

arXiv:2106.11331 [pdf, other]

Exploiting timing capabilities of the CHEOPS mission with warm-Jupiter planets

Authors: Borsato L, Piotto G, Gandolfi D, Nascimbeni V, Lacedelli G, Marzari F, Billot N, Maxted P, Sousa S G, Cameron A C, Bonfanti A, Wilson T, Serrano L, Garai Z, Alibert Y, Alonso R, Asquier J, Bárczy T, Bandy T, Barrado D, Barros S C, Baumjohann W, Beck M, Beck T, Benz W, et al. (53 additional authors not shown)

Abstract: We present 17 transit light curves of seven known warm-Jupiters observed with the CHaracterising ExOPlanet Satellite (CHEOPS). The light curves have been collected as part of the CHEOPS Guaranteed Time Observation (GTO) program that searches for transit-timing variation (TTV) of warm-Jupiters induced by a possible external perturber to shed light on the evolution path of such planetary systems. We describe the CHEOPS observation process, from the planning to the data analysis. In this work we focused on the timing performance of CHEOPS, the impact of the sampling of the transit phases, and the improvement we can obtain combining multiple transits together. We reached the highest precision on the transit time of about 13-16 s for the brightest target (WASP-38, G = 9.2) in our sample. From the combined analysis of multiple transits of fainter targets with G >= 11 we obtained a timing precision of about 2 min. Additional observations with CHEOPS, covering a longer temporal baseline, will further improve the precision on the transit times and will allow us to detect possible TTV signals induced by an external perturber.

Submitted 21 June, 2021; originally announced June 2021.

Comments: 23 pages, 19 figures, 8 tables. Accepted for publication in MNRAS

arXiv:2101.02137 [pdf, other]

Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint

Authors: Nithia Vijayan, Prashanth L. A

Abstract: We propose two policy gradient algorithms for solving the problem of control in an off-policy reinforcement learning (RL) context. Both algorithms incorporate a smoothed functional (SF) based gradient estimation scheme. The first algorithm is a straightforward combination of importance sampling-based off-policy evaluation with SF-based gradient estimation. The second algorithm, inspired by the stochastic variance-reduced gradient (SVRG) algorithm, incorporates variance reduction in the update iteration. For both algorithms, we derive non-asymptotic bounds that establish convergence to an approximate stationary point. From these results, we infer that the first algorithm converges at a rate that is comparable to the well-known REINFORCE algorithm in an off-policy RL context, while the second algorithm exhibits an improved rate of convergence.

Submitted 23 June, 2024; v1 submitted 6 January, 2021; originally announced January 2021.

arXiv:2011.14280 [pdf, other]

A Novel Sentiment Analysis Engine for Preliminary Depression Status Estimation on Social Media

Authors: Sudhir Kumar Suman, Hrithwik Shalu, Lakshya A Agrawal, Archit Agrawal, Juned Kadiwala

Abstract: Text sentiment analysis for preliminary depression status estimation of users on social media is a widely exercised and feasible method. However, the immense variety of users accessing social media websites and their ample mix of vocabularies make it difficult for commonly applied deep learning-based classifiers to perform well. In addition, the lack of adaptability of traditional supervised machine learning could hurt at many levels. We propose a cloud-based smartphone application with a deep learning-based backend, primarily to perform depression detection on Twitter. The backend model consists of a RoBERTa-based siamese sentence classifier that compares a given tweet (Query) with a labeled set of tweets with known sentiment (Standard Corpus). The standard corpus is varied over time with expert opinion so as to improve the model's reliability. A psychologist (with the patient's permission) could leverage the application to assess the patient's depression status prior to counseling, which provides better insight into the patient's mental health status. In addition, the psychologist could be referred to cases with similar characteristics, which could in turn help in more effective treatment. We evaluate our backend model after fine-tuning it on a publicly available dataset. The fine-tuned model is made to predict depression on a large set of tweet samples with random noise factors. The model achieved a testing accuracy of 87.23% and an AUC of 0.8621.

Submitted 28 November, 2020; originally announced November 2020.

arXiv:2011.06273 [pdf, ps, other]

Algebraic properties of summation of exponential Taylor polynomials

Authors: Lingfeng Ao, Shaofang Hong

Abstract: Let $n\ge 1$ be an integer and $e_n(x)$ denote the truncated exponential Taylor polynomial, i.e. $e_{n}(x)=\sum_{i=0}^n\frac{x^i}{i!}$. A well-known theorem of Schur states that the Galois group of $e_n(x)$ over $\Q$ is the alternating group $A_n$ if $n$ is divisible by 4 or the symmetric group $S_n$ otherwise. In this paper, we study algebraic properties of the summation of two truncated exponential Taylor polynomials $\E_n(x):=e_n(x)+e_{n-1}(x)$. We show that $\frac{x^n}{n!}+\sum_{i=0}^{n-1}c_i\frac{x^i}{i!}$ with all $c_i \ (0\le i\le n-1)$ being integers is irreducible over $\Q$ if either $c_0=\pm 1$, or $n$ is not a positive power of $2$ but $|c_0|$ is a positive power of 2. This extends another theorem of Schur. We show also that $\E_n(x)$ is irreducible if $n\not\in\{2,4\}$. Furthermore, we show that ${\rm Gal}_{\Q}(\E_n)$ contains $A_{n}$ except for $n=4$, in which case, ${\rm Gal}_{\Q}(\E_4)=S_3$. Finally, we show that the Galois group ${\rm Gal}_{\Q}(\E_n)$ is $S_n$ if $n\equiv 3 \pmod 4$, or if $n$ is even and $v_p(n!)$ is odd for a prime divisor $p$ of $n-1$, or if $n\equiv 1\pmod 4$ and $n-2$ equals the product of an odd prime number $p$ which is coprime to $\sum_{i=1}^{p-1}2^{p-1-i}i!$ and a positive integer coprime to $p$.

Submitted 12 November, 2020; originally announced November 2020.

Comments: 14 pages

arXiv:2011.05163 [pdf, other]

Amadeus: Scalable, Privacy-Preserving Live Video Analytics

Authors: Sandeep Dsouza, Victor Bahl, Lixiang Ao, Landon P. Cox

Abstract: Smart-city applications ranging from traffic management to public-safety alerts rely on live analytics of video from surveillance cameras in public spaces. However, a growing number of government regulations stipulate how data collected from these cameras must be handled in order to protect citizens' privacy. This paper describes Amadeus, which balances privacy and utility by redacting video in near realtime for smart-city applications. Our main insight is that whitelisting objects, or blocking by default, is crucial for scalable, privacy-preserving video analytics. In the context of modern object detectors, we prove that whitelisting reduces the risk of an object-detection error leading to a privacy violation, and helps Amadeus scale to a large and diverse set of applications. In particular, Amadeus utilizes whitelisting to generate composable encrypted object-specific live streams, which simultaneously meet the requirements of multiple applications in a privacy-preserving fashion, while reducing the compute and streaming-bandwidth requirements at the edge. Experiments with our Amadeus prototype show that compared to blacklisting objects, whitelisting yields significantly better privacy (up to ~28x) and bandwidth savings (up to ~5.5x). Additionally, our experiments also indicate that the composable live streams generated by Amadeus are usable by real-world applications with minimum utility loss.

Submitted 6 November, 2020; originally announced November 2020.

Comments: 17 pages, 19 figures

ACM Class: D.4.7

arXiv:2002.11440 [pdf, ps, other]

Non-asymptotic bounds for stochastic optimization with biased noisy gradient oracles

Authors: Nirav Bhavsar, Prashanth L. A

Abstract: We introduce biased gradient oracles to capture a setting where the function measurements have an estimation error that can be controlled through a batch size parameter. Our proposed oracles are appealing in several practical contexts, for instance, risk measure estimation from a batch of independent and identically distributed (i.i.d.) samples, or simulation optimization, where the function measurements are 'biased' due to computational constraints. In either case, increasing the batch size reduces the estimation error. We highlight the applicability of our biased gradient oracles in a risk-sensitive reinforcement learning setting. In the stochastic non-convex optimization context, we analyze a variant of the randomized stochastic gradient (RSG) algorithm with a biased gradient oracle. We quantify the convergence rate of this algorithm by deriving non-asymptotic bounds on its performance. Next, in the stochastic convex optimization setting, we derive non-asymptotic bounds for the last iterate of a stochastic gradient descent (SGD) algorithm with a biased gradient oracle.

Submitted 16 May, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:1912.10398 [pdf, other]

Estimation of Spectral Risk Measures

Authors: Ajay Kumar Pandey, Prashanth L. A., Sanjay P. Bhat

Abstract: We consider the problem of estimating a spectral risk measure (SRM) from i.i.d. samples, and propose a novel method that is based on numerical integration. We show that our SRM estimate concentrates exponentially, when the underlying distribution has bounded support. Further, we also consider the case when the underlying distribution is either Gaussian or exponential, and derive a concentration bound for our estimation scheme. We validate the theoretical findings on a synthetic setup, and in a vehicular traffic routing application.

Submitted 22 December, 2019; originally announced December 2019.

arXiv:1904.00092 [pdf, other]

$\mathcal{PT}$-symmetric tight-binding model with asymmetric couplings

Authors: Moreno-Rodríguez L. A., Izrailev F. M., Méndez-Bermúdez J. A

Abstract: We study spectral and transport properties of one-dimensional tight-binding $\mathcal{PT}$-symmetric chains with alternating couplings. Based on the transfer matrix method, we have analytically derived expressions for the transmission and reflection coefficients for any values of the control parameters. These expressions are obtained in a very compact form that separately embeds the generic energy dependence valid for any periodic structure and the specific properties of a unit cell composing the scattering setup. Our main interest is in specific properties of the left/right reflections that are due to the $\mathcal{PT}$-symmetric structure of the model. We have found that, for the case of asymmetric couplings between dimers, a new type of specific point emerges in the spectrum, which is responsible for quite specific properties of the unidirectional reflectivity.

Submitted 29 March, 2019; originally announced April 2019.

arXiv:1902.10709 [pdf, ps, other]

A Wasserstein distance approach for concentration of empirical risk estimates

Authors: Prashanth L. A., Sanjay P. Bhat

Abstract: This paper presents a unified approach based on Wasserstein distance to derive concentration bounds for empirical estimates for two broad classes of risk measures defined in the paper. The classes of risk measures introduced include as special cases well-known risk measures from the finance literature such as conditional value at risk (CVaR), optimized certainty equivalent risk, spectral risk measures, utility-based shortfall risk, cumulative prospect theory (CPT) value, rank dependent expected utility and distorted risk measures. Two estimation schemes are considered, one for each class of risk measures. One estimation scheme involves applying the risk measure to the empirical distribution function formed from a collection of i.i.d. samples of the random variable (r.v.), while the second scheme involves applying the same procedure to a truncated sample. The bounds provided apply to three popular classes of distributions, namely sub-Gaussian, sub-exponential and heavy-tailed distributions. The bounds are derived by first relating the estimation error to the Wasserstein distance between the true and empirical distributions, and then using recent concentration bounds for the latter. Previous concentration bounds are available only for specific risk measures such as CVaR and CPT-value. The bounds derived in this paper are shown to either match or improve upon previous bounds in cases where they are available. The usefulness of the bounds is illustrated through an algorithm and the corresponding regret bound for a stochastic bandit problem involving a general risk measure from each of the two classes introduced in the paper.

Submitted 10 May, 2022; v1 submitted 27 February, 2019; originally announced February 2019.

arXiv:1902.02953 [pdf, ps, other]

Correlated bandits or: How to minimize mean-squared error online

Authors: Vinay Praneeth Boda, Prashanth L. A

Abstract: While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings, finding an arm that best captures information about other arms is of interest. This objective, however, requires learning the underlying correlation structure and not just the means of the arms. Sensor placement for industrial surveillance and cellular network monitoring are two applications in which the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we first derive an MSE estimator, based on sample variances and covariances, and show that our estimator concentrates exponentially around the true MSE. Under a best-arm identification framework, we propose a successive-rejects-type algorithm and provide bounds on the probability of error in identifying the best arm. Using minimax theory, we also derive fundamental performance limits for the correlated bandit problem.

Submitted 26 June, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

arXiv:1901.05466 [pdf, other]

doi 10.3847/1538-3881/ab8a49

A Wide Orbit Exoplanet OGLE-2012-BLG-0838Lb

Authors: Poleski R., Suzuki D., Udalski A., Xie X., Yee J. C., Koshimoto N., Gaudi B. S., Gould A., Skowron J., Szymanski M. K., Soszynski I., Pietrukowicz P., Kozlowski S., Wyrzykowski L., Ulaczyk K., Abe F., Barry R. K., Bennett D. P., Bhattacharya A., Bond I. A., Donachie M., Fujii H., Fukui A., Itow Y., Hirao Y., et al. (26 additional authors not shown)

Abstract: We present the discovery of a planet on a very wide orbit in the microlensing event OGLE-2012-BLG-0838. The signal of the planet is well separated from the main peak of the event, and the planet-star projected separation is found to be twice as large as the Einstein ring radius, which roughly corresponds to a projected separation of ~4 AU. Similar planets around low-mass stars are very hard to find using any technique other than microlensing. We discuss microlensing model fitting in detail and consider the prospects for directly measuring the mass and distance of the lens system.

Submitted 17 November, 2021; v1 submitted 16 January, 2019; originally announced January 2019.

Comments: 26 pages, 11 figures

Journal ref: Astronomical Journal, Volume 159, Issue 6, id.261, 16 pp. (2020)

arXiv:1901.00997 [pdf, ps, other]

Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions

Authors: Prashanth L. A., Krishna Jagannathan, Ravi Kumar Kolla

Abstract: Conditional Value-at-Risk (CVaR) is a widely used risk metric in applications such as finance. We derive concentration bounds for CVaR estimates, considering separately the cases of light-tailed and heavy-tailed distributions. In the light-tailed case, we use a classical CVaR estimator based on the empirical distribution constructed from the samples. For heavy-tailed random variables, we assume a mild 'bounded moment' condition, and derive a concentration bound for a truncation-based estimator. Notably, our concentration bounds enjoy an exponential decay in the sample size, for heavy-tailed as well as light-tailed distributions. To demonstrate the applicability of our concentration results, we consider a CVaR optimization problem in a multi-armed bandit setting. Specifically, we address the best CVaR-arm identification problem under a fixed budget. We modify the well-known successive rejects algorithm to incorporate a CVaR-based criterion. Using the CVaR concentration result, we derive an upper-bound on the probability of incorrect identification by the proposed algorithm.

Submitted 25 August, 2019; v1 submitted 4 January, 2019; originally announced January 2019.

arXiv:1810.09126 [pdf, ps, other]

Risk-Sensitive Reinforcement Learning via Policy Gradient Search

Authors: Prashanth L. A., Michael Fu

Abstract: The objective in a traditional reinforcement learning (RL) problem is to find a policy that optimizes the expected value of a performance metric such as the infinite-horizon cumulative discounted or long-run average cost/reward. In practice, optimizing the expected value alone may not be satisfactory, in that it may be desirable to incorporate the notion of risk into the optimization problem formulation, either in the objective or as a constraint. Various risk measures have been proposed in the literature, e.g., exponential utility, variance, percentile performance, chance constraints, value at risk (quantile), conditional value-at-risk, prospect theory and its later enhancement, cumulative prospect theory. In this book, we consider risk-sensitive RL in two settings: one where the goal is to find a policy that optimizes the usual expected value objective while ensuring that a risk constraint is satisfied, and the other where the risk measure is the objective. We survey some of the recent work in this area specifically where policy gradient search is the solution approach. In the first risk-sensitive RL setting, we cover popular risk measures based on variance, conditional value-at-risk, and chance constraints, and present a template for policy gradient-based risk-sensitive RL algorithms using a Lagrangian formulation. For the setting where risk is incorporated directly into the objective function, we consider an exponential utility formulation, cumulative prospect theory, and coherent risk measures. This non-exhaustive survey aims to give a flavor of the challenges involved in solving risk-sensitive RL problems using policy gradient methods, as well as outlining some potential future research directions.

Submitted 23 May, 2022; v1 submitted 22 October, 2018; originally announced October 2018.

Comments: To appear in "Foundations and Trends in Machine Learning"

arXiv:1808.02871 [pdf, ps, other]

Random directions stochastic approximation with deterministic perturbations

Authors: Prashanth L A, Shalabh Bhatnagar, Nirav Bhavsar, Michael Fu, Steven I. Marcus

Abstract: We introduce deterministic perturbation schemes for the recently proposed random directions stochastic approximation (RDSA) [17], and propose new first-order and second-order algorithms. In the latter case, these are the first second-order algorithms to incorporate deterministic perturbations. We show that the gradient and/or Hessian estimates in the resulting algorithms with deterministic perturbations are asymptotically unbiased, so that the algorithms are provably convergent. Furthermore, we derive convergence rates to establish the superiority of the first-order and second-order algorithms, for the special case of a convex and quadratic optimization problem, respectively. Numerical experiments are used to validate the theoretical results.

Submitted 28 March, 2019; v1 submitted 8 August, 2018; originally announced August 2018.

arXiv:1808.01739 [pdf, ps, other]

Concentration bounds for empirical conditional value-at-risk: The unbounded case

Authors: Ravi Kumar Kolla, Prashanth L. A., Sanjay P. Bhat, Krishna Jagannathan

Abstract: In several real-world applications involving decision making under uncertainty, the traditional expected value objective may not be suitable, as it may be necessary to control losses in the case of a rare but extreme event. Conditional Value-at-Risk (CVaR) is a popular risk measure for modeling the aforementioned objective. We consider the problem of estimating CVaR from i.i.d. samples of an unbounded random variable, which is either sub-Gaussian or sub-exponential. We derive a novel one-sided concentration bound for a natural sample-based CVaR estimator in this setting. Our bound relies on a concentration result for a quantile-based estimator for Value-at-Risk (VaR), which may be of independent interest.

Submitted 6 August, 2018; originally announced August 2018.

arXiv:1708.04847 [pdf, ps, other]

Unnormalized quasi-distributions and tomograms of quantum states

Authors: Man'ko V. I., Markovich L. A

Abstract: Tomograms and quasi-distribution functions like Wigner, Glauber-Sudarshan $P$- and Husimi $Q$-functions that violate the standard normalization condition are considered. Conditions under which a reconstruction of the density matrix using these tomograms and quasi-distribution functions is possible are obtained. Three different examples of states like the de Broglie plane wave, the Moschinsky shutter problem and the stationary state of the charged particle in the uniform and constant electric field are studied. Their tomograms and quasi-distribution functions expressed in terms of the Dirac delta function, the Airy function and the Fresnel integrals are shown to violate the standard normalization condition and thus the density matrix of the state cannot always be reconstructed.

Submitted 16 August, 2017; originally announced August 2017.

Comments: 19 pages, no figures

arXiv:1611.10283 [pdf, ps, other]

Bandit algorithms to emulate human decision making using probabilistic distortions

Authors: Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus

Abstract: Motivated by models of human decision making proposed to explain commonly observed deviations from conventional expected value preferences, we formulate two stochastic multi-armed bandit problems with distorted probabilities on the reward distributions: the classic $K$-armed bandit and the linearly parameterized bandit settings. We consider the aforementioned problems in the regret minimization as well as best arm identification framework for multi-armed bandits. For the regret minimization setting in $K$-armed as well as linear bandit problems, we propose algorithms that are inspired by Upper Confidence Bound (UCB) algorithms, incorporate reward distortions, and exhibit sublinear regret. For the $K$-armed bandit setting, we derive an upper bound on the expected regret for our proposed algorithm, and then we prove a matching lower bound to establish the order-optimality of our algorithm. For the linearly parameterized setting, our algorithm achieves a regret upper bound that is of the same order as that of the regular linear bandit algorithm called Optimism in the Face of Uncertainty Linear (OFUL), and unlike OFUL, our algorithm handles distortions and an arm-dependent noise model. For the best arm identification problem in the $K$-armed bandit setting, we propose algorithms, derive guarantees on their performance, and also show that these algorithms are order optimal by proving matching fundamental limits on performance. For best arm identification in linear bandits, we propose an algorithm and establish sample complexity guarantees. Finally, we present simulation experiments which demonstrate the advantages resulting from using distortion-aware learning algorithms in a vehicular traffic routing application.

Submitted 31 October, 2023; v1 submitted 30 November, 2016; originally announced November 2016.

Comments: The material in this paper was presented in part at the 2017 AAAI Conference on Artificial Intelligence

arXiv:1609.07087 [pdf, other]

(Bandit) Convex Optimization with Biased Noisy Gradient Oracles

Authors: Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári

Abstract: Algorithms for bandit convex optimization and online learning often rely on constructing noisy gradient estimates, which are then used in appropriately adjusted first-order algorithms, replacing actual gradients. Depending on the properties of the function to be optimized and the nature of "noise" in the bandit feedback, the bias and variance of gradient estimates exhibit various tradeoffs. In this paper we propose a novel framework that replaces the specific gradient estimation methods with an abstract oracle. With the help of the new framework we unify previous works, reproducing their results in a clean and concise fashion, while, perhaps more importantly, the framework also allows us to formally show that to achieve the optimal root-$n$ rate either the algorithms that use existing gradient estimators, or the proof techniques used to analyze them have to go beyond what exists today.

Submitted 4 July, 2020; v1 submitted 22 September, 2016; originally announced September 2016.

arXiv:1602.07310 [pdf, ps, other]

doi 10.1007/JHEP04(2016)028

f(Lovelock) theories of gravity

Authors: Pablo Bueno, Pablo A. Cano, Oscar Lasso A., Pedro F. Ramirez

Abstract: f(Lovelock) gravities are simple generalizations of the usual f(R) and Lovelock theories in which the gravitational action depends on some arbitrary function of the corresponding dimensionally-extended Euler densities. In this paper we study several aspects of these theories in general dimensions. We start by identifying the generalized boundary term which makes the gravitational variational problem well-posed. Then, we show that these theories are equivalent to certain scalar-tensor theories and how this relation is characterized by the Hessian of f. We also study the linearized equations of the theory on general maximally symmetric backgrounds. Remarkably, we find that these theories do not propagate the usual ghost-like massive gravitons characteristic of higher-derivative gravities on such backgrounds. In some non-trivial cases, the additional scalar associated to the trace of the metric perturbation is also absent, with the usual graviton being the only dynamical field. In those cases, the linearized equations are exactly the same as in Einstein gravity up to an overall factor, making them appealing as holographic toy models. We also find constraints on the couplings of a broad family of five-dimensional f(Lovelock) theories using holographic entanglement entropy. Finally, we construct new analytic asymptotically flat and AdS/dS black hole solutions for some classes of f(Lovelock) gravities in various dimensions.

Submitted 8 April, 2016; v1 submitted 23 February, 2016; originally announced February 2016.

Comments: 46 pages, no figures; v3: minor modifications to match published version, references added

Report number: IFT-UAM/CSIC-16-015

Journal ref: JHEP 1604 (2016) 028

arXiv:1507.07984 [pdf, ps, other]

A constrained optimization perspective on actor critic algorithms and application to network routing

Authors: Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra

Abstract: We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application.

Submitted 28 July, 2015; originally announced July 2015.

arXiv:1506.02632 [pdf, other]

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control

Authors: Prashanth L. A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvári

Abstract: Cumulative prospect theory (CPT) is known to model human decisions well, with substantial empirical evidence supporting this claim. CPT works by distorting probabilities and is more general than the classic expected utility and coherent risk measures. We bring this idea to a risk-sensitive reinforcement learning (RL) setting and design algorithms for both estimation and control. The RL setting presents two particular challenges when CPT is applied: estimating the CPT objective requires estimations of the entire distribution of the value function and finding a randomized optimal policy. The estimation scheme that we propose uses the empirical distribution to estimate the CPT-value of a random variable. We then use this scheme in the inner loop of a CPT-value optimization procedure that is based on the well-known simulation optimization idea of simultaneous perturbation stochastic approximation (SPSA). We provide theoretical convergence guarantees for all the proposed algorithms and also illustrate the usefulness of CPT-based criteria in a traffic signal control application.

Submitted 26 February, 2016; v1 submitted 8 June, 2015; originally announced June 2015.

arXiv:1502.05577 [pdf, ps, other]

Adaptive system optimization using random directions stochastic approximation

Authors: Prashanth L. A., Shalabh Bhatnagar, Michael Fu, Steve Marcus

Abstract: We present novel algorithms for simulation optimization using random directions stochastic approximation (RDSA). These include first-order (gradient) as well as second-order (Newton) schemes. We incorporate both continuous-valued as well as discrete-valued perturbations into both our algorithms. The former are chosen to be independent and identically distributed (i.i.d.) symmetric, uniformly distributed random variables (r.v.), while the latter are i.i.d., asymmetric, Bernoulli r.v.s. Our Newton algorithm, with a novel Hessian estimation scheme, requires N-dimensional perturbations and three loss measurements per iteration, whereas the simultaneous perturbation Newton search algorithm of [1] requires 2N-dimensional perturbations and four loss measurements per iteration. We prove the unbiasedness of both gradient and Hessian estimates and asymptotic (strong) convergence for both first-order and second-order schemes. We also provide asymptotic normality results, which in particular establish that the asymmetric Bernoulli variant of Newton RDSA method is better than 2SPSA of [1]. Numerical experiments are used to validate the theoretical results.

Submitted 8 August, 2015; v1 submitted 19 February, 2015; originally announced February 2015.

arXiv:1405.2690 [pdf, ps, other]

Policy Gradients for CVaR-Constrained MDPs

Authors: Prashanth L. A.

Abstract: We study a risk-constrained version of the stochastic shortest path (SSP) problem, where the risk measure considered is Conditional Value-at-Risk (CVaR). We propose two algorithms that obtain a locally risk-optimal policy by employing four tools: stochastic approximation, mini batches, policy gradients and importance sampling. Both the algorithms incorporate a CVaR estimation procedure, along the lines of Bardou et al. [2009], which in turn is based on Rockafellar-Uryasev's representation for CVaR and utilize the likelihood ratio principle for estimating the gradient of the sum of one cost function (objective of the SSP) and the gradient of the CVaR of the sum of another cost function (in the constraint of SSP). The algorithms differ in the manner in which they approximate the CVaR estimates/necessary gradients: the first algorithm uses stochastic approximation, while the second employs mini-batches in the spirit of Monte Carlo methods. We establish asymptotic convergence of both the algorithms. Further, since estimating CVaR is related to rare-event simulation, we incorporate an importance sampling based variance reduction scheme into our proposed algorithms.

Submitted 12 May, 2014; originally announced May 2014.

arXiv:1403.6530 [pdf, other]

Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs

Authors: Prashanth L. A., Mohammad Ghavamzadeh

Abstract: In many sequential decision-making problems we may want to manage risk by minimizing some measure of variability in rewards in addition to maximizing a standard criterion. Variance related risk measures are among the most common risk-sensitive criteria in finance and operations research. However, optimizing many such criteria is known to be a hard problem. In this paper, we consider both discounted and average reward Markov decision processes. For each formulation, we first define a measure of variability for a policy, which in turn gives us a set of risk-sensitive criteria to optimize. For each of these criteria, we derive a formula for computing its gradient. We then devise actor-critic algorithms that operate on three timescales - a TD critic on the fastest timescale, a policy gradient (actor) on the intermediate timescale, and a dual ascent for Lagrange multipliers on the slowest timescale. In the discounted setting, we point out the difficulty in estimating the gradient of the variance of the return and incorporate simultaneous perturbation approaches to alleviate this. The average setting, on the other hand, allows for an actor update using compatible features to estimate the gradient of the variance. We establish the convergence of our algorithms to locally risk-sensitive optimal policies. Finally, we demonstrate the usefulness of our algorithms in a traffic signal control application.

Submitted 18 March, 2015; v1 submitted 25 March, 2014; originally announced March 2014.

arXiv:1312.7292 [pdf, ps, other]

Two Timescale Convergent Q-learning for Sleep-Scheduling in Wireless Sensor Networks

Authors: Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar

Abstract: In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work.

Submitted 23 March, 2014; v1 submitted 27 December, 2013; originally announced December 2013.

arXiv:1307.3176 [pdf, other]

Fast gradient descent for drifting least squares regression, with application to bandits

Authors: Nathaniel Korda, Prashanth L. A., Rémi Munos

Abstract: Online learning algorithms often need to recompute least squares regression estimates of parameters. We study improving the computational complexity of such algorithms by using stochastic gradient descent (SGD) type schemes in place of classic regression solvers. We show that SGD schemes efficiently track the true solutions of the regression problems, even in the presence of a drift. This finding, coupled with an $O(d)$ improvement in complexity, where $d$ is the dimension of the data, makes them attractive for implementation in big data settings. In the case when strong convexity in the regression problem is guaranteed, we provide bounds on the error both in expectation and high probability (the latter is often needed to provide theoretical guarantees for higher level algorithms), despite the drifting least squares solution. As an example of this case we prove that the regret performance of an SGD version of the PEGE linear bandit algorithm [Rusmevichientong and Tsitsiklis 2010] is worse than that of PEGE itself only by a factor of $O(\log^4 n)$. When strong convexity of the regression problem cannot be guaranteed, we investigate using an adaptive regularisation. We make an empirical study of an adaptively regularised, SGD version of LinUCB [Li et al. 2010] in a news article recommendation application, which uses the large scale news recommendation dataset from Yahoo! front page. These experiments show a large gain in computational complexity, with a consistently low tracking error and click-through-rate (CTR) performance that is $75\%$ close.

Submitted 20 November, 2014; v1 submitted 11 July, 2013; originally announced July 2013.

arXiv:1112.0795 [pdf]

An Approach to Log Management: Prototyping a Design of Agent for Log Harvesting

Authors: Mayol Arnao Reinaldo, Nuñez Luis A., Lobo Antonio

Abstract: This paper describes a work in progress implementing a solution for harvesting and transporting information logs from network devices in an e-science environment. The system is composed of servers, agents, active devices and a transporting protocol. This document describes the state of development of agents. Agents capture logs from devices, then normalize, reduce and catalog them by using metadata. Once all these processes are done, they transmit the cataloged data by using Transportation Protocol to a warehouse server. An agent also uses orchestration parameters to transmit modified logs to a data warehouse server. These parameters can be received from orchestration applications such as Taverna. The operation of the agents and the communication protocol solve some of the deficiencies of traditional log management protocols. Finally, we show some tests performed on the new prototype.

Submitted 4 December, 2011; originally announced December 2011.

arXiv:1003.5443 [pdf, ps, other]

doi 10.1016/j.geomphys.2010.06.013

Symmetries of parabolic contact structures

Authors: Lenka Zalabová

Abstract: We generalize the concept of locally symmetric spaces to parabolic contact structures. We show that symmetric normal parabolic contact structures are torsion-free and some types of them have to be locally flat. We prove that each symmetry given at a point with non-zero harmonic curvature is involutive. Finally we give restrictions on the number of different symmetries which can exist at such a point.

Submitted 29 March, 2010; originally announced March 2010.

Comments: 19 pages

MSC Class: 53C15; 53A40; 53C05; 53C35

Journal ref: Journal of Geometry and Physics, Volume 60, Issue 11, November 2010,1698-1709

arXiv:astro-ph/0003314 [pdf, ps, other]

Physics of Grain Alignment

Authors: Lazarian A

Abstract: Aligned grains provide one of the easiest ways to study magnetic fields in diffuse gas and molecular clouds. How reliable our conclusions about the inferred magnetic field are depends critically on our understanding of the physics of grain alignment. Although grain alignment is a problem of half a century standing, recent progress achieved in the field makes us believe that we are approaching the solution of this mystery. I review basic physical processes involved in grain alignment and show why mechanisms that were favored for decades do not look so promising right now. I also discuss why the radiative torque mechanism, ignored for more than 20 years, now looks like the most powerful means of grain alignment.

Submitted 21 March, 2000; originally announced March 2000.

Comments: 10 pages, review for conference "Cosmic Evolution and Galaxy Formation"

Journal ref: ASPConf.Ser.215:69,2000

Showing 1–50 of 53 results for author: Ao, L