Showing 1–9 of 9 results for author: Toyoizumi, T

  1. arXiv:2404.09821  [pdf, other]

    cs.LG stat.ML

    A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness

    Authors: Yuri Kinoshita, Taro Toyoizumi

    Abstract: While neural networks can enjoy an outstanding flexibility and exhibit unprecedented performance, the mechanism behind their behavior is still not well-understood. To tackle this fundamental challenge, researchers have tried to restrict and manipulate some of their properties in order to gain new insights and better control on them. Especially, throughout the past few years, the concept of \emph{b…

    Submitted 15 April, 2024; originally announced April 2024.

  2. arXiv:2311.10431  [pdf, other]

    cs.CL

    Causal Graph in Language Model Rediscovers Cortical Hierarchy in Human Narrative Processing

    Authors: Zhengqi He, Taro Toyoizumi

    Abstract: Understanding how humans process natural language has long been a vital research direction. The field of natural language processing (NLP) has recently experienced a surge in the development of powerful language models. These models have proven to be invaluable tools for studying another complex system known to process human language: the brain. Previous studies have demonstrated that the features…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 15 pages, 16 figures

  3. arXiv:2210.07041  [pdf, other]

    cs.CL cs.AI

    Spontaneous Emerging Preference in Two-tower Language Model

    Authors: Zhengqi He, Taro Toyoizumi

    Abstract: The ever-growing size of the foundation language model has brought significant performance gains in various types of downstream tasks. With the existence of side-effects brought about by the large size of the foundation language model such as deployment cost, availability issues, and environmental cost, there is some interest in exploring other possible directions, such as a divide-and-conquer sch…

    Submitted 13 October, 2022; originally announced October 2022.

  4. arXiv:2101.02879  [pdf, other]

    cs.AI cs.IT

    Progressive Interpretation Synthesis: Interpreting Task Solving by Quantifying Previously Used and Unused Information

    Authors: Zhengqi He, Taro Toyoizumi

    Abstract: A deep neural network is a good task solver, but it is difficult to make sense of its operation. People have different ideas about how to form the interpretation about its operation. We look at this problem from a new perspective where the interpretation of task solving is synthesized by quantifying how much and what previously unused information is exploited in addition to the information used to…

    Submitted 12 August, 2022; v1 submitted 8 January, 2021; originally announced January 2021.

    Comments: This is the authors' final version, and the article has been accepted for publication in "Neural Computation"

  5. arXiv:2003.00470  [pdf]

    stat.ML cs.CV cs.LG

    Dimensionality reduction to maximize prediction generalization capability

    Authors: Takuya Isomura, Taro Toyoizumi

    Abstract: Generalization of time series prediction remains an important open issue in machine learning, wherein earlier methods have either large generalization error or local minima. We develop an analytically solvable, unsupervised learning scheme that extracts the most informative components for predicting future inputs, termed predictive principal component analysis (PredPCA). Our scheme can effectively…

    Submitted 20 January, 2022; v1 submitted 1 March, 2020; originally announced March 2020.

    Journal ref: Nature Machine Intelligence 3, 434-446 (2021)

  6. arXiv:1808.00668  [pdf, other]

    stat.ML cs.LG

    On the achievability of blind source separation for high-dimensional nonlinear source mixtures

    Authors: Takuya Isomura, Taro Toyoizumi

    Abstract: For many years, a combination of principal component analysis (PCA) and independent component analysis (ICA) has been used for blind source separation (BSS). However, it remains unclear why these linear methods work well with real-world data that involve nonlinear source mixtures. This work theoretically validates that a cascade of linear PCA and ICA can solve a nonlinear BSS problem accurately --…

    Submitted 13 December, 2020; v1 submitted 2 August, 2018; originally announced August 2018.

  7. arXiv:1701.07974  [pdf, other]

    cs.LG cs.NE

    Reinforced stochastic gradient descent for deep neural network learning

    Authors: Haiping Huang, Taro Toyoizumi

    Abstract: Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the high-dimensional parameter space. Therefore, it is highly desirable to design an efficient algorithm to escape from these saddle points and reach a parameter re…

    Submitted 22 November, 2017; v1 submitted 27 January, 2017; originally announced January 2017.

    Comments: 12 pages and 9 figures, nearly final version as a technical report

  8. arXiv:1608.03714  [pdf, ps, other]

    cond-mat.dis-nn cond-mat.stat-mech cs.LG q-bio.NC

    Unsupervised feature learning from finite data by message passing: discontinuous versus continuous phase transition

    Authors: Haiping Huang, Taro Toyoizumi

    Abstract: Unsupervised neural network learning extracts hidden features from unlabeled training data. This is used as a pretraining step for further supervised learning in deep networks. Hence, understanding unsupervised learning is of fundamental importance. Here, we study the unsupervised learning from a finite number of data, based on the restricted Boltzmann machine learning. Our study inspires an effic…

    Submitted 10 November, 2016; v1 submitted 12 August, 2016; originally announced August 2016.

    Comments: 8 pages, 7 figures (5 pages, 4 figures in the main text and 3 pages of appendix)

    Journal ref: Phys. Rev. E 94, 062310 (2016)

  9. arXiv:1502.00186  [pdf, ps, other]

    cond-mat.stat-mech cs.LG q-bio.NC stat.ML

    Advanced Mean Field Theory of Restricted Boltzmann Machine

    Authors: Haiping Huang, Taro Toyoizumi

    Abstract: Learning in restricted Boltzmann machine is typically hard due to the computation of gradients of log-likelihood function. To describe the network state statistics of the restricted Boltzmann machine, we develop an advanced mean field theory based on the Bethe approximation. Our theory provides an efficient message passing based method that evaluates not only the partition function (free energy) b…

    Submitted 1 May, 2015; v1 submitted 31 January, 2015; originally announced February 2015.

    Comments: 5 pages, 4 figures, accepted by Phys Rev E (Rapid Communication)

    Journal ref: Phys. Rev. E 91, 050101 (2015)