Showing 1–2 of 2 results for author: Kotera, M


arXiv:2108.01485

cs.LG stat.ML

Fast Estimation Method for the Stability of Ensemble Feature Selectors

Authors: Rina Onda, Zhengyan Gao, Masaaki Kotera, Kenta Oono

Abstract: It is preferred that feature selectors be stable for better interpretability and robust prediction. Ensembling is known to be effective for improving the stability of feature selectors. Since ensembling is time-consuming, it is desirable to reduce the computational cost of estimating the stability of ensemble feature selectors. We propose a simulator of a feature selector and apply it to a fast estimation of the stability of ensemble feature selectors. To the best of our knowledge, this is the first study that estimates the stability of ensemble feature selectors and reduces the computation time both theoretically and empirically.

Submitted 3 August, 2021; originally announced August 2021.

Comments: 7 pages, plus 9 pages of supplementary material. Accepted at the ICML 2021 Workshop on Subset Selection in Machine Learning: From Theory to Practice (SubSetML). URL: https://sites.google.com/view/icml-2021-subsetml

arXiv:2010.00792

cs.LG stat.ML

Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis

Authors: Katsuhiko Ishiguro, Kazuya Ujihara, Ryohto Sawada, Hirotaka Akita, Masaaki Kotera

Abstract: Retrosynthesis is the problem of inferring the reactant compounds needed to synthesize a given product compound through chemical reactions. Recent studies on retrosynthesis focus on proposing more sophisticated prediction models, but the dataset fed to the models also plays an essential role in achieving the best-generalizing models. Generally, a dataset that is best suited for a specific task tends to be small. In such a case, the standard solution is to transfer knowledge from a large or clean dataset in the same domain. In this paper, we conduct a systematic and intensive examination of data transfer approaches on end-to-end generative models, applied to retrosynthesis. Experimental results show that typical data transfer methods can improve the test prediction scores of an off-the-shelf Transformer baseline model. In particular, the pre-training plus fine-tuning approach boosts the accuracy scores of the baseline, achieving a new state of the art. In addition, we manually inspect the erroneous prediction results. The inspection shows that the pre-training plus fine-tuning models can generate chemically appropriate or sensible proposals in almost all cases.

Submitted 2 October, 2020; originally announced October 2020.
