
Flexible Variational Information Bottleneck: Achieving Diverse Compression with a Single Training

Authors: Sota Kudo, Naoaki Ono, Shigehiko Kanaya, Ming Huang

Abstract: Information Bottleneck (IB) is a widely used framework that enables the extraction of information related to a target random variable from a source random variable. In the objective function, IB controls the trade-off between data compression and predictiveness through the Lagrange multiplier $β$. Traditionally, to find the trade-off to be learned, IB requires a search for $β$ through multiple training cycles, which is computationally expensive. In this study, we introduce Flexible Variational Information Bottleneck (FVIB), an innovative framework for classification task that can obtain optimal models for all values of $β$ with single, computationally efficient training. We theoretically demonstrate that across all values of reasonable $β$, FVIB can simultaneously maximize an approximation of the objective function for Variational Information Bottleneck (VIB), the conventional IB method. Then we empirically show that FVIB can learn the VIB objective as effectively as VIB. Furthermore, in terms of calibration performance, FVIB outperforms other IB and calibration methods by enabling continuous optimization of $β$. Our codes are available at https://github.com/sotakudo/fvib.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2312.13110 [pdf]

Pre-training of Molecular GNNs via Conditional Boltzmann Generator

Authors: Daiki Koge, Naoaki Ono, Shigehiko Kanaya

Abstract: Learning representations of molecular structures using deep learning is a fundamental problem in molecular property prediction tasks. Molecules inherently exist in the real world as three-dimensional structures; furthermore, they are not static but in continuous motion in the 3D Euclidean space, forming a potential energy surface. Therefore, it is desirable to generate multiple conformations in advance and extract molecular representations using a 4D-QSAR model that incorporates multiple conformations. However, this approach is impractical for drug and material discovery tasks because of the computational cost of obtaining multiple conformations. To address this issue, we propose a pre-training method for molecular GNNs using an existing dataset of molecular conformations to generate a latent vector universal to multiple conformations from a 2D molecular graph. Our method, called Boltzmann GNN, is formulated by maximizing the conditional marginal likelihood of a conditional generative model for conformations generation. We show that our model has a better prediction performance for molecular properties than existing pre-training methods using molecular graphs and three-dimensional molecular structures.

Submitted 18 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: 4 pages

arXiv:2307.00623 [pdf]

Variational Autoencoding Molecular Graphs with Denoising Diffusion Probabilistic Model

Authors: Daiki Koge, Naoaki Ono, Shigehiko Kanaya

Abstract: In data-driven drug discovery, designing molecular descriptors is a very important task. Deep generative models such as variational autoencoders (VAEs) offer a potential solution by designing descriptors as probabilistic latent vectors derived from molecular structures. These models can be trained on large datasets, which have only molecular structures, and applied to transfer learning. Nevertheless, the approximate posterior distribution of the latent vectors of the usual VAE assumes a simple multivariate Gaussian distribution with zero covariance, which may limit the performance of representing the latent features. To overcome this limitation, we propose a novel molecular deep generative model that incorporates a hierarchical structure into the probabilistic latent vectors. We achieve this by a denoising diffusion probabilistic model (DDPM). We demonstrate that our model can design effective molecular latent vectors for molecular property prediction from some experiments by small datasets on physical properties and activity. The results highlight the superior prediction performance and robustness of our model compared to existing approaches.

Submitted 22 August, 2023; v1 submitted 2 July, 2023; originally announced July 2023.

Comments: 2 pages. Short paper submitted to IEEE CIBCB 2023

arXiv:2207.09783 [pdf, other]

Cancer Subtyping by Improved Transcriptomic Features Using Vector Quantized Variational Autoencoder

Authors: Zheng Chen, Ziwei Yang, Lingwei Zhu, Guang Shi, Kun Yue, Takashi Matsubara, Shigehiko Kanaya, MD Altaf-Ul-Amin

Abstract: Defining and separating cancer subtypes is essential for facilitating personalized therapy modality and prognosis of patients. The definition of subtypes has been constantly recalibrated as a result of our deepened understanding. During this recalibration, researchers often rely on clustering of cancer data to provide an intuitive visual reference that could reveal the intrinsic characteristics of subtypes. The data being clustered are often omics data such as transcriptomics that have strong correlations to the underlying biological mechanism. However, while existing studies have shown promising results, they suffer from issues associated with omics data: sample scarcity and high dimensionality. As such, existing methods often impose unrealistic assumptions to extract useful features from the data while avoiding overfitting to spurious correlations. In this paper, we propose to leverage a recent strong generative model, Vector Quantized Variational AutoEncoder (VQ-VAE), to tackle the data issues and extract informative latent features that are crucial to the quality of subsequent clustering by retaining only information relevant to reconstructing the input. VQ-VAE does not impose strict assumptions and hence its latent features are better representations of the input, capable of yielding superior clustering performance with any mainstream clustering method. Extensive experiments and medical analysis on multiple datasets comprising 10 distinct cancers demonstrate the VQ-VAE clustering results can significantly and robustly improve prognosis over prevalent subtyping systems.

Submitted 20 July, 2022; originally announced July 2022.

Comments: 12 pages

arXiv:2204.03173 [pdf, other]

Automated Sleep Staging via Parallel Frequency-Cut Attention

Authors: Zheng Chen, Ziwei Yang, Lingwei Zhu, Wei Chen, Toshiyo Tamura, Naoaki Ono, MD Altaf-Ul-Amin, Shigehiko Kanaya, Ming Huang

Abstract: This paper proposes a novel framework for automatically capturing the time-frequency nature of electroencephalogram (EEG) signals of human sleep based on the authoritative sleep medicine guidance. The framework consists of two parts: the first part extracts informative features by partitioning the input EEG spectrograms into a sequence of time-frequency patches. The second part is constituted by an attention-based architecture to efficiently search for the correlation between partitioned time-frequency patches and defining factors of sleep stages in parallel. The proposed pipeline is validated on the Sleep Heart Health Study dataset with new state-of-the-art results for the stages wake, N2, and N3, obtaining respective F1 scores of 0.93, 0.88, and 0.87, with only EEG signals used. The proposed method also has a high inter-rater reliability of 0.80 kappa. We also visualize the correspondence between sleep staging decisions and features extracted by the proposed method, providing strong interpretability for our model.

Submitted 12 January, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: 10 pages, 9 figures

arXiv:2204.02278 [pdf, other]

Cancer Subtyping via Embedded Unsupervised Learning on Transcriptomics Data

Authors: Ziwei Yang, Lingwei Zhu, Zheng Chen, Ming Huang, Naoaki Ono, MD Altaf-Ul-Amin, Shigehiko Kanaya

Abstract: Cancer is one of the deadliest diseases worldwide. Accurate diagnosis and classification of cancer subtypes are indispensable for effective clinical treatment. Promising results on automatic cancer subtyping systems have been published recently with the emergence of various deep learning methods. However, such automatic systems often overfit the data due to the high dimensionality and scarcity. In this paper, we propose to investigate automatic subtyping from an unsupervised learning perspective by directly constructing the underlying data distribution itself, hence sufficient data can be generated to alleviate the issue of overfitting. Specifically, we bypass the strong Gaussianity assumption that typically exists but fails in the unsupervised learning subtyping literature due to small-sized samples by vector quantization. Our proposed method better captures the latent space features and models the cancer subtype manifestation on a molecular basis, as demonstrated by the extensive experimental results.

Submitted 2 April, 2022; originally announced April 2022.

Comments: 4 pages, accepted for EMBC 2022

arXiv:1905.04028 [pdf, other]

Demand and Welfare Analysis in Discrete Choice Models with Social Interactions

Authors: Debopam Bhattacharya, Pascaline Dupas, Shin Kanaya

Abstract: Many real-life settings of consumer-choice involve social interactions, causing targeted policies to have spillover-effects. This paper develops novel empirical tools for analyzing demand and welfare-effects of policy-interventions in binary choice settings with social interactions. Examples include subsidies for health-product adoption and vouchers for attending a high-achieving school. We establish the connection between econometrics of large games and Brock-Durlauf-type interaction models, under both I.I.D. and spatially correlated unobservables. We develop new convergence results for associated beliefs and estimates of preference-parameters under increasing-domain spatial asymptotics. Next, we show that even with fully parametric specifications and unique equilibrium, choice data, that are sufficient for counterfactual demand-prediction under interactions, are insufficient for welfare-calculations. This is because distinct underlying mechanisms producing the same interaction coefficient can imply different welfare-effects and deadweight-loss from a policy-intervention. Standard index-restrictions imply distribution-free bounds on welfare. We illustrate our results using experimental data on mosquito-net adoption in rural Kenya.

Submitted 7 May, 2024; v1 submitted 10 May, 2019; originally announced May 2019.

arXiv:1903.01634 [pdf, ps, other]

Resonance width for a particle-core coupling model with a square-well potential

Authors: K. Hagino, H. Sagawa, S. Kanaya, A. Odahara

Abstract: We derive a simple formula for the width of a multi-channel resonance state. To this end, we use a deformed square-well potential and solve the coupled-channels equations. We obtain the $S$-matrix in the Breit-Wigner form, from which partial widths can be extracted. We apply the resultant formula to a deformed nucleus and discuss the behavior of partial width for an $s$-wave channel.

Submitted 5 December, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

Comments: 19 pages, 6 eps figures. To appear in Prog. Theor. Exp. Phys

Showing 1–8 of 8 results for author: Kanaya, S