Showing 1–19 of 19 results for author: Kumagai, W

  1. arXiv:2401.17780  [pdf, other]

    cs.LG

    A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

    Authors: Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo

    Abstract: We study a primal-dual (PD) reinforcement learning (RL) algorithm for online constrained Markov decision processes (CMDPs). Despite its widespread practical use, the existing theoretical literature on PD-RL algorithms for this problem only provides sublinear regret guarantees and fails to ensure convergence to optimal policies. In this paper, we introduce a novel policy gradient PD algorithm with…

    Submitted 1 July, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  2. arXiv:2311.09706  [pdf, other]

    cs.AI cs.HC cs.LG

    Towards Autonomous Hypothesis Verification via Language Models with Minimal Guidance

    Authors: Shiro Takagi, Ryutaro Yamauchi, Wataru Kumagai

    Abstract: Research automation efforts usually employ AI as a tool to automate specific tasks within the research process. To create an AI that truly conducts research itself, it must independently generate hypotheses, design verification plans, and execute verification. Therefore, we investigated whether an AI could autonomously generate and verify hypotheses for a toy machine learning research problem…

    Submitted 16 November, 2023; originally announced November 2023.

  3. arXiv:2309.13078  [pdf, other]

    cs.AI cs.LG cs.PL

    LPML: LLM-Prompting Markup Language for Mathematical Reasoning

    Authors: Ryutaro Yamauchi, Sho Sonoda, Akiyoshi Sannai, Wataru Kumagai

    Abstract: In utilizing large language models (LLMs) for mathematical reasoning, addressing the errors in the reasoning and calculation present in the generated text by LLMs is a crucial challenge. In this paper, we propose a novel framework that integrates the Chain-of-Thought (CoT) method with an external tool (Python REPL). We discovered that by prompting LLMs to generate structured text in XML-like marku…

    Submitted 11 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  4. arXiv:2305.13185  [pdf, other]

    cs.LG

    Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

    Authors: Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo

    Abstract: Mirror descent value iteration (MDVI), an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL), has served as the basis for recent high-performing practical RL algorithms. However, despite the use of function approximation in practice, the theoretical understanding of MDVI has been limited to tabular Markov decision processes (MDPs). We study MDVI with linear fu…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: ICML 2023 accepted

  5. arXiv:2209.07036  [pdf, other]

    cs.LG stat.ML

    Langevin Autoencoders for Learning Deep Latent Variable Models

    Authors: Shohei Taniguchi, Yusuke Iwasawa, Wataru Kumagai, Yutaka Matsuo

    Abstract: Markov chain Monte Carlo (MCMC), such as Langevin dynamics, is valid for approximating intractable distributions. However, its usage is limited in the context of deep latent variable models owing to costly datapoint-wise sampling iterations and slow convergence. This paper proposes the amortized Langevin dynamics (ALD), wherein datapoint-wise MCMC iterations are entirely replaced with updates of a…

    Submitted 11 October, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: accepted at Neural Information Processing Systems (NeurIPS 2022)

  6. arXiv:2110.08092  [pdf, other]

    cs.LG

    Equivariant and Invariant Reynolds Networks

    Authors: Akiyoshi Sannai, Makoto Kawano, Wataru Kumagai

    Abstract: Invariant and equivariant networks are useful in learning data with symmetry, including images, sets, point clouds, and graphs. In this paper, we consider invariant and equivariant networks for symmetries of finite groups. Invariant and equivariant networks have been constructed by various researchers using Reynolds operators. However, Reynolds operators are computationally expensive when the orde…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: 15 pages, 4 figures

  7. arXiv:2102.08759  [pdf, other]

    cs.LG stat.ML

    Group Equivariant Conditional Neural Processes

    Authors: Makoto Kawano, Wataru Kumagai, Akiyoshi Sannai, Yusuke Iwasawa, Yutaka Matsuo

    Abstract: We present the group equivariant conditional neural process (EquivCNP), a meta-learning method with permutation invariance in a data set as in conventional conditional neural processes (CNPs), and it also has transformation equivariance in data space. Incorporating group equivariance, such as rotation and scaling equivariance, provides a way to consider the symmetry of real-world data. We give a d…

    Submitted 17 February, 2021; originally announced February 2021.

  8. arXiv:2012.13882  [pdf, ps, other]

    stat.ML cs.LG

    Universal Approximation Theorem for Equivariant Maps by Group CNNs

    Authors: Wataru Kumagai, Akiyoshi Sannai

    Abstract: Group symmetry is inherent in a wide variety of data distributions. Data processing that preserves symmetry is described as an equivariant map and often effective in achieving high performance. Convolutional neural networks (CNNs) have been known as models with equivariance and shown to approximate equivariant maps for some specific groups. However, universal approximation theorems for CNNs have b…

    Submitted 27 December, 2020; originally announced December 2020.

  9. arXiv:1806.00569  [pdf, other]

    stat.ML cs.LG

    Variable Selection for Nonparametric Learning with Power Series Kernels

    Authors: Kota Matsui, Wataru Kumagai, Kenta Kanamori, Mitsuaki Nishikimi, Takafumi Kanamori

    Abstract: In this paper, we propose a variable selection method for general nonparametric kernel-based estimation. The proposed method consists of two-stage estimation: (1) construct a consistent estimator of the target function, (2) approximate the estimator using a few variables by l1-type penalized estimation. We see that the proposed method can be applied to various kernel nonparametric estimation such…

    Submitted 4 December, 2018; v1 submitted 1 June, 2018; originally announced June 2018.

    Comments: 24 pages, 3 tables, 2 figures

  10. arXiv:1711.07693  [pdf, other]

    stat.ML cs.LG

    Regret Analysis for Continuous Dueling Bandit

    Authors: Wataru Kumagai

    Abstract: The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. In this research, we address a dueling bandit problem based on a cost function over a continuous space. We propose a stochastic mirror descent algorithm and show that the algorithm achieves an $O(\sqrt{T\log T})$-regret bound under stron…

    Submitted 12 December, 2017; v1 submitted 21 November, 2017; originally announced November 2017.

    Comments: 14 pages. This paper was accepted at NIPS 2017 as a spotlight presentation

  11. arXiv:1610.08696  [pdf, ps, other]

    stat.ML cs.LG

    Learning Bound for Parameter Transfer Learning

    Authors: Wataru Kumagai

    Abstract: We consider a transfer-learning problem by using the parameter transfer approach, where a suitable parameter of feature mapping is learned through one task and applied to another objective task. Then, we introduce the notion of the local stability and parameter transfer learnability of parametric feature mapping, and thereby derive a learning bound for parameter transfer algorithms. As an applicati…

    Submitted 17 January, 2017; v1 submitted 27 October, 2016; originally announced October 2016.

    Comments: This paper was accepted at NIPS 2016 as a poster presentation

  12. Asymptotic Compatibility between LOCC Conversion and Recovery

    Authors: Kosuke Ito, Wataru Kumagai, Masahito Hayashi

    Abstract: Recently, entanglement concentration was explicitly shown to be irreversible. However, it is still not clear what kind of states can be reversibly converted in the asymptotic setting by LOCC when neither the initial nor the target state is maximally entangled. We derive the necessary and sufficient condition for the reversibility of LOCC conversions between two bipartite pure entangled states in t…

    Submitted 21 August, 2015; v1 submitted 12 April, 2015; originally announced April 2015.

    Comments: 16 pages, 6 figures

    Journal ref: Phys. Rev. A 92, 052308 (2015)

  13. arXiv:1409.3912  [pdf, other]

    stat.ML cs.LG

    Parallel Distributed Block Coordinate Descent Methods based on Pairwise Comparison Oracle

    Authors: Kota Matsui, Wataru Kumagai, Takafumi Kanamori

    Abstract: This paper provides a block coordinate descent algorithm to solve unconstrained optimization problems. In our algorithm, computation of function values or gradients is not required. Instead, pairwise comparison of function values is used. Our algorithm consists of two steps; one is the direction estimate step and the other is the search step. Both steps require only pairwise comparison of function…

    Submitted 13 September, 2014; originally announced September 2014.

  14. arXiv:1401.3781  [pdf, ps, other]

    quant-ph cs.IT

    Random Number Conversion and LOCC Conversion via Restricted Storage

    Authors: Wataru Kumagai, Masahito Hayashi

    Abstract: We consider random number conversion (RNC) through random number storage with restricted size. We clarify the relation between the performance of RNC and the size of storage in the framework of first- and second- order asymptotics, and derive their rate regions. Then, we show that the results for RNC with restricted storage recover those for conventional RNC without storage in the limit of storage…

    Submitted 21 November, 2017; v1 submitted 15 January, 2014; originally announced January 2014.

    Comments: 53 pages

  15. arXiv:1306.4166  [pdf, ps, other]

    quant-ph cs.IT

    Second-Order Asymptotics of Conversions of Distributions and Entangled States Based on Rayleigh-Normal Probability Distributions

    Authors: Wataru Kumagai, Masahito Hayashi

    Abstract: We discuss the asymptotic behavior of conversions between two independent and identical distributions up to the second-order conversion rate when the conversion is produced by a deterministic function from the input probability space to the output probability space. To derive the second-order conversion rate, we introduce new probability distributions named Rayleigh-normal distributions. The famil…

    Submitted 21 November, 2017; v1 submitted 18 June, 2013; originally announced June 2013.

    Comments: 49 pages

  16. arXiv:1305.6250  [pdf, ps, other]

    quant-ph

    Trade-off between Performance and Reversibility of Entanglement Concentration for Pure Entangled State

    Authors: Wataru Kumagai, Masahito Hayashi

    Abstract: In quantum information theory, it is widely believed that entanglement concentration for bipartite pure states is asymptotically reversible. In order to examine this, we give a precise formulation of the problem, and show a trade-off relation between performance and reversibility, which implies the irreversibility of entanglement concentration. Then, we regard entanglement concentration as entangl…

    Submitted 12 September, 2013; v1 submitted 27 May, 2013; originally announced May 2013.

    Comments: 10 pages, 5 figures. Harrow & Lo's paper and Hayden & Winter's paper were added in references and a relation between those papers and the paper was clarified. The title was changed

    Journal ref: Phys. Rev. Lett. 111, 130407 (2013)

  17. arXiv:1303.0669  [pdf, ps, other]

    cs.IT

    Second Order Asymptotics for Random Number Generation

    Authors: Wataru Kumagai, Masahito Hayashi

    Abstract: We treat a random number generation from an i.i.d. probability distribution of $P$ to that of $Q$. When $Q$ or $P$ is a uniform distribution, the problems have been well-known as the uniform random number generation and the resolvability problem respectively, and analyzed not only in the context of the first order asymptotic theory but also that in the second asymptotic theory. On the other hand,…

    Submitted 4 March, 2013; originally announced March 2013.

    Comments: 6 pages, 3 figures

    MSC Class: 94A15

  18. arXiv:1205.4370  [pdf, ps, other]

    quant-ph

    Irreversibility of Entanglement Concentration for Pure State

    Authors: Wataru Kumagai, Masahito Hayashi

    Abstract: For a pure state $ψ$ on a composite system $\mathcal{H}_A\otimes\mathcal{H}_B$, both the entanglement cost $E_C(ψ)$ and the distillable entanglement $E_D(ψ)$ coincide with the von Neumann entropy $H(\mathrm{Tr}_{B}ψ)$. Therefore, the entanglement concentration from the multiple state $ψ^{\otimes n}$ of a pure state $ψ$ to the multiple state $Φ^{\otimes L_n}$ of the EPR state $Φ$ seems to be able t…

    Submitted 19 May, 2012; originally announced May 2012.

    Comments: 6 pages, 1 figure

  19. arXiv:1110.6255  [pdf, ps, other]

    quant-ph math.ST

    Quantum hypothesis testing for quantum Gaussian states: Quantum analogues of chi-square, t and F tests

    Authors: Wataru Kumagai, Masahito Hayashi

    Abstract: We treat quantum counterparts of testing problems whose optimal tests are given by chi-square, t and F tests. These quantum counterparts are formulated as quantum hypothesis testing problems concerning quantum Gaussian states families, and contain disturbance parameters, which have group symmetry. Quantum Hunt-Stein Theorem removes a part of these disturbance parameters, but other types of difficu…

    Submitted 28 October, 2011; originally announced October 2011.

    Comments: 34 pages, 3 figures

    Journal ref: Communications in Mathematical Physics, 318(2), 535-574, 2013