Showing 1–30 of 30 results for author: Wachi, A

  1. arXiv:2404.11049  [pdf, other]

    cs.LG cs.AI cs.CL

    Stepwise Alignment for Constrained Language Model Policy Optimization

    Authors: Akifumi Wachi, Thien Q. Tran, Rei Sato, Takumi Tanabe, Youhei Akimoto

    Abstract: Safety and trustworthiness are indispensable requirements for real-world applications of AI systems using large language models (LLMs). This paper formulates human value alignment as an optimization problem of the language model policy to maximize reward under a safety constraint, and then proposes an algorithm, Stepwise Alignment for Constrained Policy Optimization (SACPO). One key idea behind SA…

    Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  2. arXiv:2403.05492  [pdf, other]

    math.AC

    The Strong Lefschetz Property of Gorenstein Algebras Generated by Relative Invariants

    Authors: Takahiro Nagaoka, Akihito Wachi

    Abstract: We prove the strong Lefschetz property for Artinian Gorenstein algebras generated by the relative invariants of prehomogeneous vector spaces of commutative parabolic type.

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 12 pages

    MSC Class: 13E10; 11S90; 17B10

  3. arXiv:2402.02025  [pdf, ps, other]

    cs.LG cs.AI

    A Survey of Constraint Formulations in Safe Reinforcement Learning

    Authors: Akifumi Wachi, Xun Shen, Yanan Sui

    Abstract: Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent's policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite re…

    Submitted 7 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted at IJCAI-24 survey track

  4. arXiv:2401.03786  [pdf, other]

    cs.LG cs.AI cs.RO

    Long-term Safe Reinforcement Learning with Binary Feedback

    Authors: Akifumi Wachi, Wataru Hashimoto, Kazumune Hashimoto

    Abstract: Safety is an indispensable requirement for applying reinforcement learning (RL) to real problems. Although there has been a surge of safe RL algorithms proposed in recent years, most existing work typically 1) relies on receiving numeric safety feedback; 2) does not guarantee safety during the learning process; 3) limits the problem to a priori known, deterministic transition dynamics; and/or 4) a…

    Submitted 11 January, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: Accepted to AAAI-24

  5. arXiv:2310.10076  [pdf, other]

    cs.CL cs.AI

    Verbosity Bias in Preference Labeling by Large Language Models

    Authors: Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto

    Abstract: In recent years, Large Language Models (LLMs) have witnessed a remarkable surge in prevalence, altering the landscape of natural language processing and machine learning. One key factor in improving the performance of LLMs is alignment with humans achieved with Reinforcement Learning from Human Feedback (RLHF), as for many LLMs such as GPT-4, Bard, etc. In addition, recent studies are investigatin…

    Submitted 16 October, 2023; originally announced October 2023.

  6. arXiv:2310.03225  [pdf, other]

    cs.LG cs.AI cs.RO

    Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms

    Authors: Akifumi Wachi, Wataru Hashimoto, Xun Shen, Kazumune Hashimoto

    Abstract: Safe exploration is essential for the practical use of reinforcement learning (RL) in many real-world scenarios. In this paper, we present a generalized safe exploration (GSE) problem as a unified formulation of common safe exploration problems. We then propose a solution of the GSE problem in the form of a meta-algorithm for safe exploration, MASE, which combines an unconstrained RL algorithm wit…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  7. arXiv:2308.05306  [pdf, other]

    eess.SY

    Bayesian Meta-Learning on Control Barrier Functions with Data from On-Board Sensors

    Authors: Wataru Hashimoto, Kazumune Hashimoto, Akifumi Wachi, Xun Shen, Masako Kishida, Shigemasa Takai

    Abstract: In this paper, we consider a way to safely navigate the robots in unknown environments using measurement data from sensory devices. The control barrier function (CBF) is one of the promising approaches to encode safety requirements of the system and the recent progress on learning-based approaches for CBF realizes online synthesis of CBF-based safe controllers with sensor measurements. However, th…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: submitted for publication

  8. arXiv:2111.04894  [pdf, other]

    cs.LG cs.AI cs.RO

    Safe Policy Optimization with Local Generalized Linear Function Approximations

    Authors: Akifumi Wachi, Yunyue Wei, Yanan Sui

    Abstract: Safe exploration is a key to applying reinforcement learning (RL) in safety-critical systems. Existing safe exploration methods guaranteed safety under the assumption of regularity, and it has been difficult to apply them to large-scale real problems. We propose a novel algorithm, SPO-LF, that optimizes an agent's policy while learning the relation between a locally available feature obtained by s…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: 18 pages, 6 figures, Accepted to NeurIPS-21

  9. arXiv:2110.12214  [pdf, other]

    math.OC

    Learning-based Event-triggered MPC with Gaussian processes under terminal constraints

    Authors: Yuga Onoue, Kazumune Hashimoto, Akifumi Wachi

    Abstract: Event-triggered control strategy is capable of significantly reducing the number of control task executions without sacrificing control performance. In this paper, we propose a novel learning-based approach towards an event-triggered model predictive control (MPC) for nonlinear control systems whose dynamics are unknown a priori. In particular, the optimal control problems (OCPs) are formulated bas…

    Submitted 1 January, 2024; v1 submitted 23 October, 2021; originally announced October 2021.

    Comments: submitted for publication

  10. arXiv:2110.10973  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    LOA: Logical Optimal Actions for Text-based Interaction Games

    Authors: Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, Alexander Gray

    Abstract: We present Logical Optimal Actions (LOA), an action decision architecture of reinforcement learning applications with a neuro-symbolic framework which is a combination of neural network and symbolic knowledge acquisition approach for natural language interaction games. The demonstration for LOA experiments consists of a web-based interactive platform for text-based games and visualization for acqu…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: ACL-IJCNLP 2021 (demo paper)

  11. arXiv:2110.10963  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    Neuro-Symbolic Reinforcement Learning with First-Order Logic

    Authors: Daiki Kimura, Masaki Ono, Subhajit Chaudhury, Ryosuke Kohita, Akifumi Wachi, Don Joven Agravante, Michiaki Tatsubori, Asim Munawar, Alexander Gray

    Abstract: Deep reinforcement learning (RL) methods often require many trials before convergence, and no direct interpretability of trained policies is provided. In order to achieve fast convergence and interpretability for the policy in RL, we propose a novel RL method for text-based games with a recent neuro-symbolic framework called Logical Neural Network, which can learn symbolic and interpretable rules…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: EMNLP 2021 (main conference)

  12. arXiv:2103.02363  [pdf, other]

    cs.AI

    Reinforcement Learning with External Knowledge by using Logical Neural Networks

    Authors: Daiki Kimura, Subhajit Chaudhury, Akifumi Wachi, Ryosuke Kohita, Asim Munawar, Michiaki Tatsubori, Alexander Gray

    Abstract: Conventional deep reinforcement learning methods are sample-inefficient and usually require a large number of training trials before convergence. Since such methods operate on an unconstrained action set, they can lead to useless actions. A recent neuro-symbolic framework called the Logical Neural Networks (LNNs) can simultaneously provide key-properties of both neural networks and symbolic logic.…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: KBRL Workshop at IJCAI-PRICAI 2020

  13. arXiv:2010.04379  [pdf, other]

    cs.CL

    Q-learning with Language Model for Edit-based Unsupervised Summarization

    Authors: Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana

    Abstract: Unsupervised methods are promising for abstractive text summarization in that the parallel corpora is not required. However, their performance is still far from being satisfied, therefore research on promising solutions is on-going. In this paper, we propose a new approach based on Q-learning with an edit-based summarization. The method combines two key modules to form an Editorial Agent and Langu…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: 14 pages, 4 figures

  14. arXiv:2008.06626  [pdf, other]

    cs.LG cs.AI cs.RO

    Safe Reinforcement Learning in Constrained Markov Decision Processes

    Authors: Akifumi Wachi, Yanan Sui

    Abstract: Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications. In this paper, we propose an algorithm, SNO-MDP, that explores and optimizes Markov decision processes under unknown safety constraints. Specifically, we take a stepwise approach for optimizing safety and cumulative reward. In our method, the agent first le…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: 10 pages, 6 figures, Accepted to ICML2020

  15. Enhanced zeta distributions and its functional equations

    Authors: Kyo Nishiyama, Bent Ørsted, Akihito Wachi

    Abstract: We consider an "enhanced symmetric space", which is a prehomogeneous vector space. This vector space is intimately related to a double flag variety studied in \cite{NO.2018}. On a distinguished open orbit called "enhanced positive cone", we consider a zeta integral with two complex variables, which is analytically continued to meromorphic family of tempered distributions. One of the main resul…

    Submitted 5 May, 2019; originally announced May 2019.

    Comments: 12 pages, article for proceedings of GROUP32

    MSC Class: Primary 22E46; Secondary 14M15; 11S90; 22E45; 47G10

    Journal ref: IOP Conf. Series: Journal of Physics: Conf. Series 1194 (2019) 012081

  16. arXiv:1903.10654  [pdf, other]

    cs.LG cs.AI cs.RO

    Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving

    Authors: Akifumi Wachi

    Abstract: We examine the problem of adversarial reinforcement learning for multi-agent domains including a rule-based agent. Rule-based algorithms are required in safety-critical applications for them to work properly in a wide range of situations. Hence, every effort is made to find failure scenarios during the development phase. However, as the software becomes complicated, finding failure cases becomes d…

    Submitted 24 May, 2019; v1 submitted 25 March, 2019; originally announced March 2019.

  17. arXiv:1809.04232  [pdf, other]

    cs.AI

    Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

    Authors: Akifumi Wachi, Hiroshi Kajino, Asim Munawar

    Abstract: In many real-world applications (e.g., planetary exploration, robot navigation), an autonomous agent must be able to explore a space with guaranteed safety. Most safe exploration algorithms in the field of reinforcement learning and robotics have been based on the assumption that the safety features are a priori known and time-invariant. This paper presents a learning algorithm called ST-SafeMDP f…

    Submitted 11 September, 2018; originally announced September 2018.

    Comments: 12 pages, 7 figures

  18. arXiv:1806.06154  [pdf, ps, other]

    math.AC

    The strong Lefschetz property for complete intersections defined by products of linear forms

    Authors: Tadahito Harima, Akihito Wachi, Junzo Watanabe

    Abstract: We prove the strong Lefschetz property for certain complete intersections defined by products of linear forms, using a characterization of the strong Lefschetz property in terms of central simple modules.

    Submitted 15 June, 2018; originally announced June 2018.

    Comments: 9 pages

    MSC Class: 13C40; 13E10

  19. arXiv:1708.01990  [pdf, ps, other]

    math.AC

    Lefschetz properties for complete intersection ideals generated by products of linear forms

    Authors: Martina Juhnke-Kubitzke, Rosa M. Miró-Roig, Satoshi Murai, Akihito Wachi

    Abstract: In this paper, we study the strong Lefschetz property of artinian complete intersection ideals generated by products of linear forms. We prove the strong Lefschetz property for a class of such ideals with binomial generators.

    Submitted 7 August, 2017; originally announced August 2017.

    Comments: 9 pages

    MSC Class: 13E10; 13C40

  20. arXiv:1703.07199  [pdf, ps, other]

    math.AC

    A characterization of the Macaulay dual generators for quadratic complete intersections

    Authors: Tadahito Harima, Akihito Wachi, Junzo Watanabe

    Abstract: Let $F$ be a homogeneous polynomial in $n$ variables of degree $d$ over a field $K$. Let $A(F)$ be the associated Artinian graded $K$-algebra. If $B \subset A(F)$ is a subalgebra of $A(F)$ which is Gorenstein with the same socle degree as $A(F)$, we describe the Macaulay dual generator for $B$ in terms of $F$. Furthermore when $n=d$, we give necessary and sufficient conditions on the polynomial…

    Submitted 3 April, 2017; v1 submitted 21 March, 2017; originally announced March 2017.

  21. arXiv:1611.02801  [pdf, ps, other]

    math.AC

    The resultants of quadratic binomial complete intersections

    Authors: Tadahito Harima, Akihito Wachi, Junzo Watanabe

    Abstract: We compute the resultants for quadratic binomial complete intersections. As an application we show that any quadratic binomial complete intersection can have the set of square-free monomials as a vector space basis if the generators are put in a normal form.

    Submitted 8 November, 2016; originally announced November 2016.

    Comments: 14 pages

    MSC Class: 13P15; 13F20; 13M10

  22. arXiv:1601.06928  [pdf, ps, other]

    math.AC

    The EGH Conjecture and the Sperner property of complete intersections

    Authors: Tadahito Harima, Akihito Wachi, Junzo Watanabe

    Abstract: Let $A$ be a graded complete intersection over a field and $B$ the monomial complete intersection with the generators of the same degrees as $A$. The EGH conjecture says that if $I$ is a graded ideal in $A$, then there should be an ideal $J$ in $B$ such that $B/J$ and $A/I$ have the same Hilbert function. We show that if the EGH conjecture is true, then it can be used to prove that every graded co…

    Submitted 26 January, 2016; originally announced January 2016.

    Comments: 7 pages

    MSC Class: 13M10

  23. arXiv:1501.06982  [pdf, ps, other]

    math.AC

    The quadratic complete intersections with the action of the symmetric group

    Authors: Tadahito Harima, Akihito Wachi, Junzo Watanabe

    Abstract: We prove that any quadratic complete intersection with certain action of the symmetric group has the strong Lefschetz property over a field of characteristic zero. As a consequence of it we construct a new class of homogeneous complete intersections with generators of higher degrees which have the strong Lefschetz property.

    Submitted 10 May, 2015; v1 submitted 27 January, 2015; originally announced January 2015.

    Comments: 12 pages

    MSC Class: 13A50 (primary); 13A02 (secondary)

  24. arXiv:1403.7982  [pdf, ps, other]

    math.RT

    Codimension one connectedness of the graph of associated varieties

    Authors: Kyo Nishiyama, Peter Trapa, Akihito Wachi

    Abstract: Let $ π$ be an irreducible Harish-Chandra $ (\mathfrak{g}, K) $-module, and denote its associated variety by $ AV(π) $. If $ AV(π) $ is reducible, then each irreducible component must contain codimension one boundary component. Thus we are interested in the codimension one adjacency of nilpotent orbits for a symmetric pair $ (G, K) $. We define the notion of orbit graph and associated graph for…

    Submitted 8 October, 2014; v1 submitted 31 March, 2014; originally announced March 2014.

    Comments: 47 pages. Appendix is added. To appear in Tohoku Math. J.

    MSC Class: 22E45; 22E46; 05E10; 05C50

  25. arXiv:0909.0365  [pdf, ps, other]

    math.AC

    Generic initial ideals of some monomial complete intersections in four variables

    Authors: Tadahito Harima, Sho Sakaki, Akihito Wachi

    Abstract: Let $R = K[x_1, x_2, x_3, x_4]$ be the polynomial ring over a field of characteristic zero. For the ideal $(x_1^a, x_2^b, x_3^c, x_4^d) \subset R$, where at least one of $a$, $b$, $c$ and $d$ is equal to two, we prove that its generic initial ideal with respect to the reverse lexicographic order is the almost revlex ideal corresponding to the same Hilbert function.

    Submitted 2 September, 2009; originally announced September 2009.

    Comments: 9 pages

    MSC Class: 13A02 (Primary); 13C40; 13F20; 13D40 (Secondary)

  26. arXiv:0809.3558  [pdf, ps, other]

    math.RT math.AC math.AG

    Strong Lefschetz elements of the coinvariant rings of finite Coxeter groups

    Authors: Toshiaki Maeno, Yasuhide Numata, Akihito Wachi

    Abstract: For the coinvariant rings of finite Coxeter groups of types other than H$_4$, we show that a homogeneous element of degree one is a strong Lefschetz element if and only if it is not fixed by any reflections. We also give the necessary and sufficient condition for strong Lefschetz elements in the invariant subrings of the coinvariant rings of Weyl groups.

    Submitted 21 September, 2008; originally announced September 2008.

    Comments: 18 pages

    MSC Class: 20F55; 13A50; 14M15; 14N15

    Journal ref: Algebr. Represent. Theory 14 (2011), no. 4, 625-638

  27. arXiv:0808.0607  [pdf, ps, other]

    math.RT

    A note on the Capelli identities for symmetric pairs of Hermitian type

    Authors: Kyo Nishiyama, Akihito Wachi

    Abstract: We get several identities of differential operators in determinantal form. These identities are non-commutative versions of the formula of Cauchy-Binet or Laplace expansions of determinants, and if we take principal symbols, they are reduced to such classical formulas. These identities are naturally arising from the generators of the rings of invariant differential operators over symmetric space…

    Submitted 5 August, 2008; originally announced August 2008.

    Comments: 29 pages

    MSC Class: 17B35 (Primary) 22E46; 16S32; 15A15 (Secondary)

  28. arXiv:0707.2247  [pdf, ps, other]

    math.AC

    Generic initial ideals, graded Betti numbers and $k$-Lefschetz properties

    Authors: Tadahito Harima, Akihito Wachi

    Abstract: We introduce the $k$-strong Lefschetz property ($k$-SLP) and the $k$-weak Lefschetz property ($k$-WLP) for graded Artinian $K$-algebras, which are generalizations of the Lefschetz properties. The main results obtained in this paper are as follows: 1. Let $I$ be a graded ideal of $R=K[x_1, x_2, x_3]$ whose quotient ring $R/I$ has the SLP. Then the generic initial ideal of $I$ is the unique almo…

    Submitted 19 July, 2007; v1 submitted 15 July, 2007; originally announced July 2007.

    Comments: 36 pages; a reference [CP07] added

    MSC Class: 13A02 (Primary) 13D07; 13F20; 13D40; 13C05 (Secondary)

  29. The strong Lefschetz property of the coinvariant ring of the Coxeter group of type H4

    Authors: Yasuhide Numata, Akihito Wachi

    Abstract: We prove that the coinvariant ring of the irreducible Coxeter group of type H4 has the strong Lefschetz property.

    Submitted 28 February, 2007; originally announced March 2007.

    Comments: 9 pages

    MSC Class: 20F55; 13A50; 14M15; 14N15

    Journal ref: J. Algebra 318 (2007), no. 2, 1032-1038

  30. arXiv:math/0510033  [pdf, ps, other]

    math.RT

    Intersection of harmonics and Capelli identities for symmetric pairs

    Authors: Soo Teck Lee, Kyo Nishiyama, Akihito Wachi

    Abstract: We consider a see-saw pair consisting of a Hermitian symmetric pair (G_R, K_R) and a compact symmetric pair (M_R, H_R), where (G_R, H_R) and (K_R, M_R) form real reductive dual pairs in a large symplectic group. In this setting, we get Capelli identities which explicitly represent certain K_C-invariant elements in U(Lie(G)_C) in terms of H_C-invariant elements in U(Lie(M)_C). The corresponding H…

    Submitted 3 October, 2005; originally announced October 2005.

    Comments: 20 pages

    MSC Class: 17B35; 22E46; 16S32; 15A15