Search | arXiv e-print repository

A Survey of Pre-trained Language Models for Processing Scientific Text

Authors: Xanh Ho, Anh Khoa Duong Nguyen, An Tuan Dao, Junfeng Jiang, Yuki Chida, Kaito Sugimoto, Huy Quoc To, Florian Boudin, Akiko Aizawa

Abstract: The number of Language Models (LMs) dedicated to processing scientific text is on the rise. Keeping pace with the rapid growth of scientific LMs (SciLMs) has become a daunting task for researchers. To date, no comprehensive surveys on SciLMs have been undertaken, leaving this issue unaddressed. Given the constant stream of new SciLMs, the state of the art and how these models compare to each other remain largely unknown. This work fills that gap and provides a comprehensive review of SciLMs, including an extensive analysis of their effectiveness across different domains, tasks and datasets, and a discussion on the challenges that lie ahead.

Submitted 31 January, 2024; originally announced January 2024.

Comments: Resources are available at https://github.com/Alab-NII/Awesome-SciLM

arXiv:2108.00625 [pdf, other]

Adaptive t-Momentum-based Optimization for Unknown Ratio of Outliers in Amateur Data in Imitation Learning

Authors: Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto

Abstract: Behavioral cloning (BC) bears a high potential for safe and direct transfer of human skills to robots. However, demonstrations performed by human operators often contain noise or imperfect behaviors that can affect the efficiency of the imitator if left unchecked. In order to allow the imitators to effectively learn from imperfect demonstrations, we propose to employ the robust t-momentum optimization algorithm. This algorithm builds on the Student's t-distribution in order to deal with heavy-tailed data and reduce the effect of outlying observations. We extend the t-momentum algorithm to allow for an adaptive and automatic robustness and show empirically how the algorithm can be used to produce robust BC imitators against datasets with unknown heaviness. Indeed, the imitators trained with the t-momentum-based Adam optimizers displayed robustness to imperfect demonstrations on two different manipulation tasks with different robots and revealed the capability to take advantage of the additional data while reducing the adverse effect of non-optimal behaviors.

Submitted 2 August, 2021; originally announced August 2021.

Comments: 7 pages, Accepted in IROS 2021, See video pitch with main result on https://youtu.be/FfW_moT1-wU

arXiv:2003.00179 [pdf, other]

TAdam: A Robust Stochastic Gradient Optimizer

Authors: Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto

Abstract: Machine learning algorithms aim to find patterns from observations, which may include some noise, especially in the robotics domain. To perform well even with such noise, we expect them to be able to detect outliers and discard them when needed. We therefore propose a new stochastic gradient optimization method, whose robustness is directly built into the algorithm, using the robust Student's t-distribution as its core idea. Adam, the popular optimization method, is modified with our method, and the resultant optimizer, called TAdam, is shown to effectively outperform Adam in terms of robustness against noise on diverse tasks, ranging from regression and classification to reinforcement learning problems. The implementation of our algorithm can be found at https://github.com/Mahoumaru/TAdam.git

Submitted 2 March, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

Comments: 9 pages

arXiv:1902.02954 [pdf, other]

Synergistic Effects in Networked Epidemic Spreading Dynamics

Authors: Masaki Ogura, Wenjie Mei, Kenji Sugimoto

Abstract: In this brief, we study epidemic spreading dynamics taking place in complex networks. We specifically investigate the effect of synergy, where multiple interactions between nodes result in a combined effect larger than the simple sum of their separate effects. Although synergistic effects play key roles in various biological and social phenomena, their analyses have been often performed by means of approximation techniques and for limited types of networks. In order to address this limitation, this paper proposes a rigorous approach to quantitatively understand the effect of synergy in the Susceptible-Infected-Susceptible model taking place in an arbitrary complex network. We derive an upper bound on the growth rate of the synergistic Susceptible-Infected-Susceptible model in terms of the eigenvalues of a matrix whose size grows quadratically with the number of the nodes in the network. We confirm the effectiveness of our result by numerical simulations on empirically observed human and animal social networks.

Submitted 19 April, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

Comments: Accepted for publication in IEEE Transactions on Circuits and Systems II: Express Briefs

Showing 1–4 of 4 results for author: Sugimoto, K