-
Optimal Bayesian predictive probability for delayed response in single-arm clinical trials with binary efficacy outcome
Authors:
Takuya Yoshimoto,
Satoru Shinoda,
Kouji Yamamoto,
Kouji Tahata
Abstract:
In oncology, phase II or multiple expansion cohort trials are crucial for clinical development plans. This is because they aid in identifying potent agents with sufficient activity to continue development and confirm the proof of concept. Typically, these clinical trials are single-arm trials, with the primary endpoint being short-term treatment efficacy. Despite the development of several well-designed methodologies, there may be a practical impediment in that the endpoints may not be observed within a sufficient time for adaptive go/no-go decisions to be made in a timely manner at each interim monitoring. Specifically, the Response Evaluation Criteria in Solid Tumors guideline defines a confirmed response and requires it in non-randomized trials where the response is the primary endpoint. However, obtaining the confirmed outcome from all participants entered at interim monitoring may be time-consuming, as non-responders should be followed up until the disease progresses. Thus, this study proposed an approach to accelerate the decision-making process that incorporated the outcome without confirmation by discounting its contribution to the decision-making framework using the generalized Bayes' theorem. Further, the behavior of the proposed approach was evaluated through a simple simulation study. The results demonstrated that the proposed approach made appropriate interim go/no-go decisions.
Submitted 23 May, 2024;
originally announced May 2024.
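For orientation, the basic single-endpoint Bayesian predictive probability that this line of work builds on can be sketched as follows. This is not the authors' delayed-response method: it omits the discounting of unconfirmed outcomes via the generalized Bayes' theorem. It assumes a Beta(a, b) prior with positive integer parameters (so the Beta tail probability reduces to a binomial sum), and all function names, parameter names, and thresholds are hypothetical.

```python
import math


def beta_tail(alpha, beta, p0):
    """P(p > p0) for p ~ Beta(alpha, beta); assumes positive integer alpha, beta."""
    n = alpha + beta - 1
    # Identity for integer parameters: 1 - I_{p0}(alpha, beta) = P(Bin(n, p0) <= alpha - 1)
    return sum(math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(alpha))


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binom_pmf(y, m, alpha, beta):
    """P(Y = y) for Y ~ Beta-Binomial(m, alpha, beta)."""
    return math.comb(m, y) * math.exp(
        log_beta(alpha + y, beta + m - y) - log_beta(alpha, beta)
    )


def predictive_probability(x, n, n_max, p0, theta_t, a=1, b=1):
    """Probability that the trial ends in a 'go', given x responses in n patients.

    A final 'go' is declared when P(p > p0 | all n_max outcomes) > theta_t.
    """
    m = n_max - n                    # patients still to be observed
    alpha, beta = a + x, b + n - x   # current posterior parameters
    pp = 0.0
    for y in range(m + 1):           # y = number of future responders
        if beta_tail(alpha + y, beta + m - y, p0) > theta_t:
            pp += beta_binom_pmf(y, m, alpha, beta)
    return pp


# Interim look: 4 responses in 10 patients, 20 planned, null response rate 0.2.
print(predictive_probability(x=4, n=10, n_max=20, p0=0.2, theta_t=0.9))
```

At an interim look, a low value would argue for stopping for futility and a high value for continuing; the cut-offs are design parameters chosen by simulation.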
-
Asymptotic Properties of Matthews Correlation Coefficient
Authors:
Yuki Itaya,
Jun Tamura,
Kenichi Hayashi,
Kouji Yamamoto
Abstract:
Evaluating classifications is crucial in statistics and machine learning, as it influences decision-making across various fields, such as patient prognosis and therapy in critical conditions. The Matthews correlation coefficient (MCC) is recognized as a performance metric with high reliability, offering a balanced measurement even in the presence of class imbalances. Despite its importance, there remains a notable lack of comprehensive research on the statistical inference of MCC. This deficiency often leads to studies merely validating and comparing MCC point estimates, a practice that, while common, overlooks the statistical significance and reliability of results. Addressing this research gap, our paper introduces and evaluates several methods to construct asymptotic confidence intervals for the single MCC and the differences between MCCs in paired designs. Through simulations across various scenarios, we evaluate the finite-sample behavior of these methods and compare their performances. Furthermore, through real data analysis, we illustrate the potential utility of our findings in comparing binary classifiers, highlighting the possible contributions of our research in this field.
Submitted 15 June, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
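As a reminder of the quantity under study, the MCC point estimate for a binary confusion matrix can be computed as in the minimal sketch below. It covers only the point estimate, not the asymptotic confidence intervals the paper proposes. The function name and the convention of returning 0 when the denominator vanishes are assumptions made for illustration.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0 when any row or column total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# A classifier that always predicts "positive" on imbalanced data:
# accuracy is 0.9, yet MCC is 0 because no negatives are ever predicted.
print(mcc(tp=90, fp=10, fn=0, tn=0))
```

The example shows why the metric is described as balanced under class imbalance: high accuracy from majority-class guessing does not translate into a high MCC.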
-
Small area estimation of forest biomass via a two-stage model for continuous zero-inflated data
Authors:
Grayson W. White,
Josh K. Yamamoto,
Dinan H. Elsyad,
Julian F. Schmitt,
Niels H. Korsgaard,
Jie Kate Hu,
George C. Gaines III,
Tracey S. Frescino,
Kelly S. McConville
Abstract:
The United States (US) Forest Inventory & Analysis Program (FIA) collects data on and monitors the trends of forests in the US. FIA is increasingly interested in monitoring forest attributes such as biomass at fine geographic and temporal scales, resulting in a need for assessment and development of small area estimation techniques in forest inventory. We implement a small area estimator and parametric bootstrap estimator that account for zero-inflation in biomass data via a two-stage model-based approach and compare its performance to a post-stratified estimator and to the unit- and area-level empirical best linear unbiased prediction (EBLUP) estimators. For estimator comparison, we conduct a simulation study with counties in the US state of Nevada as domains, based on sampled plot data and remote sensing data products. Results show the zero-inflated estimator has the lowest relative bias and the smallest empirical root mean square error. Moreover, the 95% confidence interval coverages of the zero-inflated estimator and the unit-level EBLUP are more accurate than those of the other two estimators. To further illustrate the practical utility, we employ a data application across the 2019 measurement year in Nevada. We introduce the R package, saeczi, which efficiently implements the zero-inflated estimator and its mean squared error estimator.
Submitted 7 June, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
Authors:
Juno Kim,
Kakei Yamamoto,
Kazusato Oko,
Zhuoran Yang,
Taiji Suzuki
Abstract:
In this paper, we extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates. We propose mean-field Langevin averaged gradient (MFL-AG), a single-loop algorithm that implements gradient descent ascent in the distribution spaces with a novel weighted averaging, and establish average-iterate convergence to the mixed Nash equilibrium. We also study both time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result which accounts for the dependency of the particle interactions on all previous distributions. Furthermore, we propose mean-field Langevin anchored best response (MFL-ABR), a symmetric double-loop algorithm based on best response dynamics with linear last-iterate convergence. Finally, we study applications to zero-sum Markov games and conduct simulations demonstrating long-term optimality.
Submitted 16 February, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
Bayesian predictive probability based on a bivariate index vector for single-arm phase II study with binary efficacy and safety endpoints
Authors:
Takuya Yoshimoto,
Satoru Shinoda,
Kouji Yamamoto,
Kouji Tahata
Abstract:
In oncology, phase II studies are crucial for clinical development plans as such studies identify potent agents with sufficient activity to continue development in the subsequent phase III trials. Traditionally, phase II studies are single-arm studies, with the primary endpoint being short-term treatment efficacy. However, drug safety is also an important consideration. In the context of such multiple-outcome designs, predictive probability-based Bayesian monitoring strategies have been developed to assess whether a clinical trial will provide enough evidence to continue with a phase III study at the scheduled end of the trial. Herein, we propose a new simple index vector for summarizing results that cannot be captured by existing strategies. Specifically, for each interim monitoring time point, we calculate the Bayesian predictive probability using our new index and use it to assign a go/no-go decision. Finally, simulation studies are performed to evaluate the operating characteristics of the design. The obtained results demonstrate that the proposed method makes appropriate interim go/no-go decisions.
Submitted 7 August, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Large-scale Gender/Age Prediction of Tumblr Users
Authors:
Yao Zhan,
Changwei Hu,
Yifan Hu,
Tejaswi Kasturi,
Shanmugam Ramasamy,
Matt Gillingham,
Keith Yamamoto
Abstract:
Tumblr, as a leading content provider and social media platform, attracts 371 million monthly visits, 280 million blogs and 53.3 million daily posts. The popularity of Tumblr provides great opportunities for advertisers to promote their products through sponsored posts. However, it is a challenging task to target specific demographic groups for ads, since Tumblr does not require user information such as gender and age during registration. Hence, to improve ad targeting, it is essential to predict users' demographics using rich content such as posts, images and social connections. In this paper, we propose graph-based and deep learning models for age and gender prediction, which take into account user activities and content features. For graph-based models, we come up with two approaches, network embedding and label propagation, to generate connection features as well as directly infer users' demographics. For deep learning models, we leverage a convolutional neural network (CNN) and a multilayer perceptron (MLP) to predict users' age and gender. Experimental results on a real Tumblr daily dataset, with hundreds of millions of active users and billions of following relations, demonstrate that our approaches significantly outperform the baseline model, improving the relative accuracy by 81% for age, and the AUC and accuracy by 5% for gender.
Submitted 2 January, 2020;
originally announced January 2020.
-
Hybrid-FL for Wireless Networks: Cooperative Learning Mechanism Using Non-IID Data
Authors:
Naoya Yoshida,
Takayuki Nishio,
Masahiro Morikura,
Koji Yamamoto,
Ryo Yonetani
Abstract:
This paper proposes a cooperative mechanism for mitigating the performance degradation due to non-independent-and-identically-distributed (non-IID) data in collaborative machine learning (ML), namely federated learning (FL), which trains an ML model using the rich data and computational resources of mobile clients without gathering their data to central systems. The data of mobile clients is typically non-IID owing to diversity among mobile clients' interests and usage, and FL with non-IID data could degrade the model performance. Therefore, to mitigate the degradation induced by non-IID data, we assume that a limited number (e.g., fewer than 1%) of clients allow their data to be uploaded to a server, and we propose a hybrid learning mechanism referred to as Hybrid-FL, wherein the server updates the model using the data gathered from the clients and aggregates the model with the models trained by clients. Hybrid-FL solves both client- and data-selection problems via heuristic algorithms, which try to select the optimal sets of clients who train models with their own data, clients who upload their data to the server, and data uploaded to the server. The algorithms increase the number of clients participating in FL and bring the data gathered on the server closer to IID, thereby improving the prediction accuracy of the aggregated model. Evaluations, which consist of network simulations and ML experiments, demonstrate that the proposed scheme achieves a 13.5% higher classification accuracy than previously proposed schemes for the non-IID case.
Submitted 5 March, 2020; v1 submitted 17 May, 2019;
originally announced May 2019.
-
Deep Learning for Classical Japanese Literature
Authors:
Tarin Clanuwat,
Mikel Bober-Irizar,
Asanobu Kitamoto,
Alex Lamb,
Kazuaki Yamamoto,
David Ha
Abstract:
Much of machine learning research focuses on producing models which perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the perspective of ML researchers, the content of the task itself is largely irrelevant, and thus there have increasingly been calls for benchmark tasks to more heavily focus on problems which are of social or cultural relevance. In this work, we introduce Kuzushiji-MNIST, a dataset which focuses on Kuzushiji (cursive Japanese), as well as two larger, more challenging datasets, Kuzushiji-49 and Kuzushiji-Kanji. Through these datasets, we wish to engage the machine learning community with the world of classical Japanese literature. Dataset available at https://github.com/rois-codh/kmnist
Submitted 3 December, 2018;
originally announced December 2018.
-
Hierarchical Reinforcement Learning with Abductive Planning
Authors:
Kazeto Yamamoto,
Takashi Onishi,
Yoshimasa Tsuruoka
Abstract:
One of the key challenges in applying reinforcement learning to real-life problems is that the amount of trial-and-error required to learn a good policy increases drastically as the task becomes complex. One potential solution to this problem is to combine reinforcement learning with automated symbolic planning and utilize prior knowledge of the domain. However, existing methods have limitations in their applicability and expressiveness. In this paper, we propose a hierarchical reinforcement learning method based on abductive symbolic planning. The planner can deal with user-defined evaluation functions and is not based on the Herbrand theorem. Therefore, it can utilize prior knowledge of the rewards and can work in a domain where the state space is unknown. We demonstrate empirically that our architecture significantly improves learning efficiency with respect to the number of training examples on the evaluation domain, in which the state space is unknown and there exist multiple goals.
Submitted 28 June, 2018;
originally announced June 2018.
-
PCAS: Pruning Channels with Attention Statistics for Deep Network Compression
Authors:
Kohei Yamamoto,
Kurato Maeno
Abstract:
Compression techniques for deep neural networks are important for implementing them on small embedded devices. In particular, channel pruning is a useful technique for realizing compact networks. However, many conventional methods require manual setting of compression ratios in each layer. It is difficult to analyze the relationships between all layers, especially for deeper models. To address these issues, we propose a simple channel-pruning technique based on attention statistics that enables evaluation of the importance of channels. We improved the method by means of a criterion for automatic channel selection, using a single compression ratio for the entire model in place of per-layer model analysis. The proposed approach achieved superior performance over conventional methods with respect to accuracy and computational cost for various models and datasets. We provide analysis results for the behavior of the proposed criterion on different datasets to demonstrate its favorable properties for channel pruning.
Submitted 20 August, 2019; v1 submitted 14 June, 2018;
originally announced June 2018.