
A Simple, Solid, and Reproducible Baseline for Bridge Bidding AI

Authors: Haruka Kita, Sotetsu Koyamada, Yotaro Yamaguchi, Shin Ishii

Abstract: Contract bridge, a cooperative game characterized by imperfect information and multi-agent dynamics, poses significant challenges and serves as a critical benchmark in artificial intelligence (AI) research. Success in this domain requires agents to effectively cooperate with their partners. This study demonstrates that an appropriate combination of existing methods can perform surprisingly well in bridge bidding against WBridge5, a leading benchmark in the bridge bidding system and a multiple-time World Computer-Bridge Championship winner. Our approach is notably simple, yet it outperforms the current state-of-the-art methodologies in this field. Furthermore, we have made our code and models publicly available as open-source software. This initiative provides a strong starting foundation for future bridge AI research, facilitating the development and verification of new strategies and advancements in the field.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted version of IEEE CoG 2024

arXiv:2406.00424 [pdf, other]

A Batch Sequential Halving Algorithm without Performance Degradation

Authors: Sotetsu Koyamada, Soichiro Nishimori, Shin Ishii

Abstract: In this paper, we investigate the problem of pure exploration in the context of multi-armed bandits, with a specific focus on scenarios where arms are pulled in fixed-size batches. Batching has been shown to enhance computational efficiency, but it can potentially lead to a degradation compared to the original sequential algorithm's performance due to delayed feedback and reduced adaptability. We introduce a simple batch version of the Sequential Halving (SH) algorithm (Karnin et al., 2013) and provide theoretical evidence that batching does not degrade the performance of the original algorithm under practical conditions. Furthermore, we empirically validate our claim through experiments, demonstrating the robust nature of the SH algorithm in fixed-size batch settings.

Submitted 1 June, 2024; originally announced June 2024.

Comments: Accepted to RLC 2024

arXiv:2304.09769 [pdf, other]

End-to-End Policy Gradient Method for POMDPs and Explainable Agents

Authors: Soichiro Nishimori, Sotetsu Koyamada, Shin Ishii

Abstract: Real-world decision-making problems are often partially observable, and many can be formulated as a Partially Observable Markov Decision Process (POMDP). When we apply reinforcement learning (RL) algorithms to a POMDP, a reasonable estimate of the hidden states can help solve the problem. Furthermore, explainable decision-making is preferable, considering applications to real-world tasks such as autonomous driving. We propose an RL algorithm that estimates the hidden states by end-to-end training and visualizes the estimation as a state-transition graph. Experimental results demonstrate that the proposed algorithm can solve simple POMDP problems and that the visualization makes the agent's behavior interpretable to humans.

Submitted 19 April, 2023; originally announced April 2023.

Comments: 10 pages, 6 figures

arXiv:2303.17503 [pdf, other]

Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning

Authors: Sotetsu Koyamada, Shinri Okano, Soichiro Nishimori, Yu Murata, Keigo Habara, Haruka Kita, Shin Ishii

Abstract: We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators. By leveraging JAX's auto-vectorization and parallelization over accelerators, Pgx can efficiently scale to thousands of simultaneous simulations over accelerators. In our experiments on a DGX-A100 workstation, we discovered that Pgx can simulate RL environments 10-100x faster than existing implementations available in Python. Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go. Additionally, Pgx offers miniature game sets and baseline models to facilitate rapid research cycles. We demonstrate the efficient training of the Gumbel AlphaZero algorithm with Pgx environments. Overall, Pgx provides high-performance environment simulators for researchers to accelerate their RL experiments. Pgx is available at http://github.com/sotetsuk/pgx.

Submitted 15 January, 2024; v1 submitted 28 March, 2023; originally announced March 2023.

arXiv:2102.03777 [pdf, other]

doi 10.1109/TNSRE.2021.3111689

EEGFuseNet: Hybrid Unsupervised Deep Feature Characterization and Fusion for High-Dimensional EEG with An Application to Emotion Recognition

Authors: Zhen Liang, Rushuang Zhou, Li Zhang, Linling Li, Gan Huang, Zhiguo Zhang, Shin Ishii

Abstract: How to effectively and efficiently extract valid and reliable features from high-dimensional electroencephalography (EEG), particularly how to fuse the spatial and temporal dynamic brain information into a better feature representation, is a critical issue in brain data analysis. Most current EEG studies work in a task-driven manner and explore the valid EEG features with a supervised model, which would be limited by the given labels to a great extent. In this paper, we propose a practical hybrid unsupervised deep convolutional recurrent generative adversarial network based EEG feature characterization and fusion model, which is termed EEGFuseNet. EEGFuseNet is trained in an unsupervised manner, and deep EEG features covering both spatial and temporal dynamics are automatically characterized. Compared to the existing features, the characterized deep EEG features could be considered to be more generic and independent of any specific EEG task. The performance of the extracted deep and low-dimensional features by EEGFuseNet is carefully evaluated in an unsupervised emotion recognition application based on three public emotion databases. The results demonstrate the proposed EEGFuseNet is a robust and reliable model, which is easy to train and performs efficiently in the representation and fusion of dynamic EEG features. In particular, EEGFuseNet is established as an optimal unsupervised fusion model with promising cross-subject emotion recognition performance. It proves EEGFuseNet is capable of characterizing and fusing deep features that imply comparative cortical dynamic significance corresponding to the changing of different emotion states, and also demonstrates the possibility of realizing EEG based cross-subject emotion recognition in a purely unsupervised manner.

Submitted 27 August, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

Journal ref: IEEE Transactions on Neural Systems and Rehabilitation Engineering, 29(2021) 1913-1925

arXiv:2004.10026 [pdf, other]

ExerSense: Real-Time Physical Exercise Segmentation, Classification, and Counting Algorithm Using an IMU Sensor

Authors: Shun Ishii, Kizito Nkurikiyeyezu, Anna Yokokubo, Guillaume Lopez

Abstract: Even though it is well known that physical exercises have numerous emotional and physical health benefits, maintaining a regular exercise routine is quite challenging. Fortunately, there exist technologies that promote physical activity. Nonetheless, almost all of these technologies only target a narrow set of physical activities (e.g., either running or walking but not both) and are only applicable either in indoor or in outdoor environments, but do not work well in both environments. This paper introduces a real-time segmentation and classification algorithm that recognizes physical exercises and that works well in both indoor and outdoor environments. The proposed algorithm achieves a 95% classification accuracy for five indoor and outdoor exercises, including segmentation error. This accuracy is similar to or better than previous works that handled only indoor workouts and those that use a vision-based approach. Moreover, while comparable machine learning-based approaches need a lot of training data, the proposed correlation-based method needs one sample of motion data of each target exercise.

Submitted 21 April, 2020; originally announced April 2020.

arXiv:1908.00876 [pdf, other]

MarmoNet: a pipeline for automated projection mapping of the common marmoset brain from whole-brain serial two-photon tomography

Authors: Henrik Skibbe, Akiya Watakabe, Ken Nakae, Carlos Enrique Gutierrez, Hiromichi Tsukada, Junichi Hata, Takashi Kawase, Rui Gong, Alexander Woodward, Kenji Doya, Hideyuki Okano, Tetsuo Yamamori, Shin Ishii

Abstract: Understanding the connectivity in the brain is an important prerequisite for understanding how the brain processes information. In the Brain/MINDS project, a connectivity study on marmoset brains uses two-photon microscopy fluorescence images of axonal projections to collect the neuron connectivity from defined brain regions at the mesoscopic scale. The processing of the images requires the detection and segmentation of the axonal tracer signal. The objective is to detect as much tracer signal as possible while not misclassifying other background structures as the signal. This can be challenging because imaging noise, a cluttered image background, distortions, or varying image contrast cause problems. We are developing MarmoNet, a pipeline that processes and analyzes tracer image data of the common marmoset brain. The pipeline incorporates state-of-the-art machine learning techniques based on artificial convolutional neural networks (CNN) and image registration techniques to extract and map all relevant information in a robust manner. The pipeline processes new images in a fully automated way. This report introduces the current state of the tracer signal analysis part of the pipeline.

Submitted 2 August, 2019; originally announced August 2019.

arXiv:1711.06564 [pdf, other]

Efficient Diverse Ensemble for Discriminative Co-Tracking

Authors: Kourosh Meshgi, Shigeyuki Oba, Shin Ishii

Abstract: Ensemble discriminative tracking utilizes a committee of classifiers to label data samples, which are in turn used for retraining the tracker to localize the target using the collective knowledge of the committee. Committee members could vary in their features, memory update schemes, or training data; however, it is inevitable to have committee members that excessively agree because of large overlaps in their version space. To remove this redundancy and have an effective ensemble learning, it is critical for the committee to include consistent hypotheses that differ from one another, covering the version space with minimum overlaps. In this study, we propose an online ensemble tracker that directly generates a diverse committee by generating an efficient set of artificial training data. The artificial data is sampled from the empirical distribution of the samples taken from both target and background, whereas the process is governed by query-by-committee to shrink the overlap between classifiers. The experimental results demonstrate that the proposed scheme outperforms conventional ensemble trackers on public benchmarks.

Submitted 7 June, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

Comments: CVPR 2018 Submission

arXiv:1706.10031 [pdf, other]

Neural Sequence Model Training via $α$-divergence Minimization

Authors: Sotetsu Koyamada, Yuta Kikuchi, Atsunori Kanemura, Shin-ichi Maeda, Shin Ishii

Abstract: We propose a new neural sequence model training method in which the objective function is defined by $α$-divergence. We demonstrate that the objective function generalizes the maximum-likelihood (ML)-based and reinforcement learning (RL)-based objective functions as special cases (i.e., ML corresponds to $α\to 0$ and RL to $α\to1$). We also show that the gradient of the objective function can be considered a mixture of ML- and RL-based objective gradients. The experimental results of a machine translation task show that minimizing the objective function with $α> 0$ outperforms $α\to 0$, which corresponds to ML-based methods.

Submitted 30 June, 2017; originally announced June 2017.

Comments: 2017 ICML Workshop on Learning to Generate Natural Language (LGNL 2017)

arXiv:1704.08821 [pdf, other]

Active Collaborative Ensemble Tracking

Authors: Kourosh Meshgi, Maryam Sadat Mirzaei, Shigeyuki Oba, Shin Ishii

Abstract: A discriminative ensemble tracker employs multiple classifiers, each of which casts a vote on all of the obtained samples. The votes are then aggregated in an attempt to localize the target object. Such a method relies on collective competence and the diversity of the ensemble to approach the target/non-target classification task from different views. However, by updating all of the ensemble using a shared set of samples and their final labels, such diversity is lost or reduced to the diversity provided by the underlying features or internal classifiers' dynamics. Additionally, the classifiers do not exchange information with each other while striving to serve the collective goal, i.e., better classification. In this study, we propose an active collaborative information exchange scheme for ensemble tracking. This not only orchestrates different classifiers towards a common goal but also provides an intelligent update mechanism to keep the diversity of classifiers and to mitigate the shortcomings of one with the others. The data exchange is optimized with regard to an ensemble uncertainty utility function, and the ensemble is updated via co-training. The evaluations demonstrate promising results realized by the proposed algorithm for real-world online tracking.

Submitted 28 April, 2017; originally announced April 2017.

Comments: AVSS 2017 Submission

arXiv:1704.03976 [pdf, other]

Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning

Authors: Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Shin Ishii

Abstract: We propose a new regularization method based on virtual adversarial loss: a new measure of local smoothness of the conditional label distribution given input. Virtual adversarial loss is defined as the robustness of the conditional label distribution around each input data point against local perturbation. Unlike adversarial training, our method defines the adversarial direction without label information and is hence applicable to semi-supervised learning. Because the directions in which we smooth the model are only "virtually" adversarial, we call our method virtual adversarial training (VAT). The computational cost of VAT is relatively low. For neural networks, the approximated gradient of virtual adversarial loss can be computed with no more than two pairs of forward- and back-propagations. In our experiments, we applied VAT to supervised and semi-supervised learning tasks on multiple benchmark datasets. With a simple enhancement of the algorithm based on the entropy minimization principle, our VAT achieves state-of-the-art performance for semi-supervised learning tasks on SVHN and CIFAR-10.

Submitted 27 June, 2018; v1 submitted 12 April, 2017; originally announced April 2017.

Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:1704.00299 [pdf, other]

Efficient Version-Space Reduction for Visual Tracking

Authors: Kourosh Meshgi, Shigeyuki Oba, Shin Ishii

Abstract: Discriminative trackers employ a classification approach to separate the target from its background. To cope with variations of the target shape and appearance, the classifier is updated online with different samples of the target and the background. Sample selection, labeling, and updating the classifier are prone to various sources of errors that drift the tracker. We introduce the use of an efficient version space shrinking strategy to reduce the labeling errors and enhance its sampling strategy by measuring the uncertainty of the tracker about the samples. The proposed tracker utilizes an ensemble of classifiers that represents different hypotheses about the target, diversifies them using boosting to provide a larger and more consistent coverage of the version-space, and tunes the classifiers' weights in voting. The proposed system adjusts the model update rate by promoting the co-training of the short-memory ensemble with a long-memory oracle. The proposed tracker outperformed state-of-the-art trackers on different sequences bearing various tracking challenges.

Submitted 2 April, 2017; originally announced April 2017.

Comments: CRV'17 Conference

arXiv:1704.00083 [pdf, other]

Efficient Asymmetric Co-Tracking using Uncertainty Sampling

Authors: Kourosh Meshgi, Maryam Sadat Mirzaei, Shigeyuki Oba, Shin Ishii

Abstract: Adaptive tracking-by-detection approaches are popular for tracking arbitrary objects. They treat the tracking problem as a classification task and use online learning techniques to update the object model. However, these approaches are heavily invested in the efficiency and effectiveness of their detectors. Evaluating a massive number of samples for each frame (e.g., obtained by a sliding window) forces the detector to trade accuracy for speed. Furthermore, misclassification of borderline samples in the detector introduces accumulating errors in tracking. In this study, we propose co-tracking based on the efficient cooperation of two detectors: a rapid adaptive exemplar-based detector and another more sophisticated but slower detector with a long-term memory. The sampling, labeling, and co-learning of the detectors are conducted by an uncertainty sampling unit, which improves the speed and accuracy of the system. We also introduce a budgeting mechanism which prevents the unbounded growth in the number of examples in the first detector to maintain its rapid response. Experiments demonstrate the efficiency and effectiveness of the proposed tracker against its baselines and its superior performance against state-of-the-art trackers on various benchmark videos.

Submitted 31 March, 2017; originally announced April 2017.

Comments: Submitted to IEEE ICSIPA'2017

arXiv:1507.00677 [pdf, other]

Distributional Smoothing with Virtual Adversarial Training

Authors: Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Ken Nakae, Shin Ishii

Abstract: We propose local distributional smoothness (LDS), a new notion of smoothness for statistical models that can be used as a regularization term to promote the smoothness of the model distribution. We name the LDS-based regularization virtual adversarial training (VAT). The LDS of a model at an input datapoint is defined as the KL-divergence-based robustness of the model distribution against local perturbation around the datapoint. VAT resembles adversarial training, but distinguishes itself in that it determines the adversarial direction from the model distribution alone without using the label information, making it applicable to semi-supervised learning. The computational cost for VAT is relatively low. For a neural network, the approximated gradient of the LDS can be computed with no more than three pairs of forward and back propagations. When we applied our technique to supervised and semi-supervised learning for the MNIST dataset, it outperformed all the training methods other than the current state-of-the-art method, which is based on a highly advanced generative model. We also applied our method to SVHN and NORB, and confirmed our method's superior performance over the current state-of-the-art semi-supervised method applied to these datasets.

Submitted 11 June, 2016; v1 submitted 2 July, 2015; originally announced July 2015.

Comments: Under review as a conference paper at ICLR 2016

arXiv:1502.00093 [pdf, other]

Deep learning of fMRI big data: a novel approach to subject-transfer decoding

Authors: Sotetsu Koyamada, Yumi Shikauchi, Ken Nakae, Masanori Koyama, Shin Ishii

Abstract: As a technology to read brain states from measurable brain activities, brain decoding is widely applied in industries and medical sciences. In spite of high demands in these applications for a universal decoder that can be applied to all individuals simultaneously, large variation in brain activities across individuals has limited the scope of many studies to the development of individual-specific decoders. In this study, we used a deep neural network (DNN), a nonlinear hierarchical model, to construct a subject-transfer decoder. Our decoder is the first successful DNN-based subject-transfer decoder. When applied to a large-scale functional magnetic resonance imaging (fMRI) database, our DNN-based decoder achieved higher decoding accuracy than other baseline methods, including support vector machine (SVM). In order to analyze the knowledge acquired by this decoder, we applied principal sensitivity analysis (PSA) to the decoder and visualized the discriminative features that are common to all subjects in the dataset. Our PSA successfully visualized the subject-independent features contributing to the subject-transferability of the trained decoder.

Submitted 31 January, 2015; originally announced February 2015.

arXiv:1412.6785 [pdf, other]

doi 10.1007/978-3-319-18038-0_48

Principal Sensitivity Analysis

Authors: Sotetsu Koyamada, Masanori Koyama, Ken Nakae, Shin Ishii

Abstract: We present a novel algorithm (Principal Sensitivity Analysis; PSA) to analyze the knowledge of the classifier obtained from supervised machine learning techniques. In particular, we define the principal sensitivity map (PSM) as the direction in the input space to which the trained classifier is most sensitive, and use the analogously defined k-th PSMs to define a basis for the input space. We train neural networks with artificial data and real data, and apply the algorithm to the obtained supervised classifiers. We then visualize the PSMs to demonstrate the PSA's ability to decompose the knowledge acquired by the trained classifiers.

Submitted 11 March, 2015; v1 submitted 21 December, 2014; originally announced December 2014.

Showing 1–16 of 16 results for author: Ishii, S