Search | arXiv e-print repository

Shower Separation in Five Dimensions for Highly Granular Calorimeters using Machine Learning

Authors: S. Lai, J. Utehs, A. Wilhahn, M. C. Fouz, O. Bach, E. Brianne, A. Ebrahimi, K. Gadow, P. Göttlicher, O. Hartbrich, D. Heuchel, A. Irles, K. Krüger, J. Kvasnicka, S. Lu, C. Neubüser, A. Provenza, M. Reinecke, F. Sefkow, S. Schuwalow, M. De Silva, Y. Sudo, H. L. Tran, L. Liu, R. Masuda, et al. (26 additional authors not shown)

Abstract: To achieve state-of-the-art jet energy resolution for Particle Flow, sophisticated energy clustering algorithms must be developed that can fully exploit available information to separate energy deposits from charged and neutral particles. Three published neural network-based shower separation models were applied to simulation and experimental data to measure the performance of the highly granular CALICE Analogue Hadronic Calorimeter (AHCAL) technological prototype in distinguishing the energy deposited by a single charged and single neutral hadron for Particle Flow. The performance of models trained using only standard spatial and energy and charged track position information from an event was compared to models trained using timing information available from AHCAL, which is expected to improve sensitivity to shower development and, therefore, aid in clustering. Both simulation and experimental data were used to train and test the models and their performances were compared. The best-performing neural network achieved significantly superior event reconstruction when timing information was utilised in training for the case where the charged hadron had more energy than the neutral one, motivating temporally sensitive calorimeters. All models under test were observed to tend to allocate energy deposited by the more energetic of the two showers to the less energetic one. Similar shower reconstruction performance was observed for a model trained on simulation and applied to data and a model trained and applied to data.

Submitted 28 June, 2024; originally announced July 2024.

arXiv:2405.10692 [pdf, other]

STOLAS: STOchastic LAttice Simulation of cosmic inflation

Authors: Yurino Mizuguchi, Tomoaki Murata, Yuichiro Tada

Abstract: We develop a C++ package of the STOchastic LAttice Simulation (STOLAS) of cosmic inflation. It performs the numerical lattice simulation in the application of the stochastic-$δN$ formalism. STOLAS can directly compute the three-dimensional map of the observable curvature perturbation without estimating its statistical properties. In its application to two toy models of inflation, chaotic inflation and Starobinsky's linear-potential inflation, we confirm that STOLAS is consistent with the standard perturbation theory. Furthermore, by introducing the importance sampling technique, we succeed in numerically sampling the current abundance of primordial black holes in a non-perturbative way. The package is available at https://github.com/STOchasticLAtticeSimulation/STOLAS_dist.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 31 pages, 10 figures

arXiv:2404.09207 [pdf, other]

DEGNN: Dual Experts Graph Neural Network Handling Both Edge and Node Feature Noise

Authors: Tai Hasegawa, Sukwon Yun, Xin Liu, Yin Jun Phua, Tsuyoshi Murata

Abstract: Graph Neural Networks (GNNs) have achieved notable success in various applications over graph data. However, recent research has revealed that real-world graphs often contain noise, and GNNs are susceptible to noise in the graph. To address this issue, several Graph Structure Learning (GSL) models have been introduced. While GSL models are tailored to enhance robustness against edge noise through edge reconstruction, a significant limitation surfaces: their high reliance on node features. This inherent dependence amplifies their susceptibility to noise within node features. Recognizing this vulnerability, we present DEGNN, a novel GNN model designed to adeptly mitigate noise in both edges and node features. The core idea of DEGNN is to design two separate experts: an edge expert and a node feature expert. These experts utilize self-supervised learning techniques to produce modified edges and node features. Leveraging these modified representations, DEGNN subsequently addresses downstream tasks, ensuring robustness against noise present in both edges and node features of real-world graphs. Notably, the modification process can be trained end-to-end, empowering DEGNN to adjust dynamically and achieve optimal edge and node representations for specific tasks. Comprehensive experiments demonstrate DEGNN's efficacy in managing noise, both in original real-world graphs and in graphs with synthetic noise.

Submitted 14 April, 2024; originally announced April 2024.

Comments: PAKDD 2024, the code is available at https://github.com/TaiHasegawa/DEGNN

arXiv:2404.03200 [pdf, other]

Future-Proofing Class Incremental Learning

Authors: Quentin Jodelet, Xin Liu, Yin Jun Phua, Tsuyoshi Murata

Abstract: Exemplar-Free Class Incremental Learning is a highly challenging setting where replay memory is unavailable. Methods relying on frozen feature extractors have drawn attention recently in this setting due to their impressive performances and lower computational costs. However, those methods are highly dependent on the data used to train the feature extractor and may struggle when an insufficient number of classes is available during the first incremental step. To overcome this limitation, we propose to use a pre-trained text-to-image diffusion model in order to generate synthetic images of future classes and use them to train the feature extractor. Experiments on the standard benchmarks CIFAR100 and ImageNet-Subset demonstrate that our proposed method can be used to improve state-of-the-art methods for exemplar-free class incremental learning, especially in the most difficult settings where the first incremental step contains only a few classes. Moreover, we show that using synthetic samples of future classes achieves higher performance than using real data from different classes, paving the way for better and less costly pre-training methods for incremental learning.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2403.04632 [pdf, other]

Software Compensation for Highly Granular Calorimeters using Machine Learning

Authors: S. Lai, J. Utehs, A. Wilhahn, O. Bach, E. Brianne, A. Ebrahimi, K. Gadow, P. Göttlicher, O. Hartbrich, D. Heuchel, A. Irles, K. Krüger, J. Kvasnicka, S. Lu, C. Neubüser, A. Provenza, M. Reinecke, F. Sefkow, S. Schuwalow, M. De Silva, Y. Sudo, H. L. Tran, E. Buhmann, E. Garutti, S. Huck, et al. (39 additional authors not shown)

Abstract: A neural network for software compensation was developed for the highly granular CALICE Analogue Hadronic Calorimeter (AHCAL). The neural network uses spatial and temporal event information from the AHCAL and energy information, which is expected to improve sensitivity to shower development and the neutron fraction of the hadron shower. The neural network method produced a depth-dependent energy weighting and a time-dependent threshold for enhancing energy deposits consistent with the timescale of evaporation neutrons. Additionally, it was observed to learn an energy weighting indicative of longitudinal leakage correction. The method also produced a linear detector response and outperformed a published control method regarding resolution for every particle energy studied.

Submitted 7 March, 2024; originally announced March 2024.

arXiv:2311.14354 [pdf, other]

Modularity-based selection of the number of slices in temporal network clustering

Authors: Patrik Seiron, Axel Lindegren, Matteo Magnani, Christian Rohner, Tsuyoshi Murata, Petter Holme

Abstract: A popular way to cluster a temporal network is to transform it into a sequence of networks, also called slices, where each slice corresponds to a time interval and contains the vertices and edges existing in that interval. A reason to perform this transformation is that after a network has been sliced, existing algorithms designed to find clusters in multilayer networks can be used. However, to use this approach, we need to know how many slices to generate. This chapter discusses how to select the number of slices when generalized modularity is used to identify the clusters.

Submitted 24 November, 2023; originally announced November 2023.

Journal ref: Temporal Network Theory (2nd ed.), Petter Holme and Jari Saramaki, eds., (Springer, Cham, 2023), pp. 435-447
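
The slicing step described in the abstract above can be illustrated with a minimal sketch. It assumes contacts are given as `(u, v, t)` tuples and that slices are equal-width time intervals; the function name `slice_temporal_network` and the equal-width choice are illustrative assumptions, not the chapter's method for selecting the number of slices.

```python
def slice_temporal_network(contacts, num_slices):
    """Split timestamped contacts (u, v, t) into equal-width time slices.

    Returns a list of edge sets, one per slice. Equal-width intervals are
    an assumption made for this sketch.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    slices = [set() for _ in range(num_slices)]
    if not contacts:
        return slices
    times = [t for _, _, t in contacts]
    t_min, t_max = min(times), max(times)
    # Fall back to a width of 1.0 when all contacts share one timestamp.
    width = (t_max - t_min) / num_slices or 1.0
    for u, v, t in contacts:
        # Clamp so that the latest contact falls into the last slice.
        index = min(int((t - t_min) / width), num_slices - 1)
        slices[index].add(frozenset((u, v)))
    return slices
```

Each returned slice can then be handed to a multilayer clustering routine as one layer; how many slices to request is exactly the question the chapter studies.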

arXiv:2310.03551 [pdf, other]

Parity-violating scalar trispectrum from a rolling axion during inflation

Authors: Tomohiro Fujita, Tomoaki Murata, Ippei Obata, Maresuke Shiraishi

Abstract: We study a mechanism of generating the trispectrum (4-point correlation) of curvature perturbation through the dynamics of a spectator axion field and U(1) gauge field during inflation. Owing to the Chern-Simons coupling, only one helicity mode of the gauge field experiences a tachyonic instability and sources scalar perturbations. The sourced curvature perturbation exhibits a parity-violating nature which can be tested through its trispectrum. We numerically compute the parity-even and parity-odd components of the sourced trispectrum. It is found that the ratio of parity-odd to parity-even mode can reach O(10%) in an exact equilateral momentum configuration. We also investigate a quasi-equilateral shape where only one of the momenta is slightly longer than the other three, and find that the parity-odd mode can reach, and more interestingly, surpass the parity-even one. This may help us to interpret a large parity-odd trispectrum signal extracted from BOSS galaxy-clustering data.

Submitted 19 March, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: 23 pages, 11 figures

Report number: RUP-23-20

arXiv:2307.00181 [pdf, other]

Influence maximization on temporal networks: a review

Authors: Eric Yanchenko, Tsuyoshi Murata, Petter Holme

Abstract: Influence maximization (IM) is an important topic in network science where a small seed set is chosen to maximize the spread of influence on a network. Recently, this problem has attracted attention on temporal networks where the network structure changes with time. IM on such dynamically varying networks is the topic of this review. We first categorize methods into two main paradigms: single and multiple seeding. In single seeding, nodes activate at the beginning of the diffusion process, and most methods either efficiently estimate the influence spread and select nodes with a greedy algorithm, or use a node-ranking heuristic. Nodes activate at different time points in the multiple seeding problem, via either sequential seeding, maintenance seeding or node probing paradigms. Throughout this review, we give special attention to deploying these algorithms in practice while also discussing existing solutions for real-world applications. We conclude by sharing important future research directions and challenges.

Submitted 30 June, 2023; originally announced July 2023.

arXiv:2306.17560 [pdf, other]

Class-Incremental Learning using Diffusion Model for Distillation and Replay

Authors: Quentin Jodelet, Xin Liu, Yin Jun Phua, Tsuyoshi Murata

Abstract: Class-incremental learning aims to learn new classes in an incremental fashion without forgetting the previously learned ones. Several research works have shown how additional data can be used by incremental models to help mitigate catastrophic forgetting. In this work, following the recent breakthrough in text-to-image generative models and their wide distribution, we propose the use of a pretrained Stable Diffusion model as a source of additional data for class-incremental learning. Compared to competitive methods that rely on external, often unlabeled, datasets of real images, our approach can generate synthetic samples belonging to the same classes as the previously encountered images. This allows us to use those additional data samples not only in the distillation loss but also for replay in the classification loss. Experiments on the competitive benchmarks CIFAR100, ImageNet-Subset, and ImageNet demonstrate how this new approach can be used to further improve the performance of state-of-the-art methods for class-incremental learning on large scale datasets.

Submitted 9 October, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

Comments: Best paper award at 1st Workshop on Visual Continual Learning, ICCV 2023

arXiv:2305.09965 [pdf, other]

Link prediction for ex ante influence maximization on temporal networks

Authors: Eric Yanchenko, Tsuyoshi Murata, Petter Holme

Abstract: Influence maximization (IM) is the task of finding the most important nodes in order to maximize the spread of influence or information on a network. This task is typically studied on static or temporal networks where the complete topology of the graph is known. In practice, however, the seed nodes must be selected before observing the future evolution of the network. In this work, we consider this realistic ex ante setting where $p$ time steps of the network have been observed before selecting the seed nodes. Then the influence is calculated after the network continues to evolve for a total of $T>p$ time steps. We address this problem by using statistical, non-negative matrix factorization and graph neural network link prediction algorithms to predict the future evolution of the network and then apply existing influence maximization algorithms on the predicted networks. Additionally, the output of the link prediction methods can be used to construct novel IM algorithms. We apply the proposed methods to eight real-world and synthetic networks to compare their performance using the Susceptible-Infected (SI) diffusion model. We demonstrate that it is possible to construct quality seed sets in the ex ante setting as we achieve influence spread within 87% of the optimal spread on seven of eight networks. In many settings, choosing seed nodes based only on historical edges provides results comparable to the results treating the future graph snapshots as known. The proposed heuristics based on the link prediction model are also some of the best-performing methods. These findings indicate that, for these eight networks under the SI model, the latent process which determines the most influential nodes may not have large temporal variation. Thus, knowing the future status of the network is not necessary to obtain good results for ex ante IM.

Submitted 12 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.
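
To make the SI diffusion and greedy seed selection mentioned in the abstract above concrete, here is a small sketch. It assumes the temporal network is a list of snapshots (each a list of undirected edges) and, as a simplifying assumption, that infection along an edge always succeeds. The names `si_spread` and `greedy_seeds` are hypothetical, and the link prediction step of the paper is not modelled.

```python
def si_spread(snapshots, seeds):
    """Run a deterministic SI process over a sequence of edge snapshots.

    Assumption for this sketch: an infected node infects every neighbour
    it touches in a snapshot (infection probability 1), and nodes infected
    in a snapshot start spreading from the next snapshot onwards.
    """
    infected = set(seeds)
    for edges in snapshots:
        newly_infected = set()
        for u, v in edges:
            if u in infected and v not in infected:
                newly_infected.add(v)
            elif v in infected and u not in infected:
                newly_infected.add(u)
        infected |= newly_infected
    return infected


def greedy_seeds(snapshots, nodes, k):
    """Pick k seeds one at a time, each maximizing the resulting spread."""
    if k > len(nodes):
        raise ValueError("k cannot exceed the number of candidate nodes")
    seeds = []
    for _ in range(k):
        candidates = [n for n in nodes if n not in seeds]
        best = max(candidates,
                   key=lambda n: len(si_spread(snapshots, seeds + [n])))
        seeds.append(best)
    return seeds
```

In an ex ante setting, `snapshots` would be the predicted future snapshots rather than the observed ones; the seed selection code itself would stay the same.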

arXiv:2302.03884 [pdf, other]

DIFF2: Differential Private Optimization via Gradient Differences for Nonconvex Distributed Learning

Authors: Tomoya Murata, Taiji Suzuki

Abstract: Differential private optimization for nonconvex smooth objectives is considered. In the previous work, the best known utility bound is $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ in terms of the squared full gradient norm, which is achieved by Differential Private Gradient Descent (DP-GD) as an instance, where $n$ is the sample size, $d$ is the problem dimensionality and $\varepsilon_\mathrm{DP}$ is the differential privacy parameter. To improve the best known utility bound, we propose a new differential private optimization framework called DIFF2 (DIFFerential private optimization via gradient DIFFerences) that constructs a differential private global gradient estimator with possibly quite small variance based on communicated gradient differences rather than gradients themselves. It is shown that DIFF2 with a gradient descent subroutine achieves the utility of $\widetilde O(d^{2/3}/(n\varepsilon_\mathrm{DP})^{4/3})$, which can be significantly better than the previous one in terms of the dependence on the sample size $n$. To the best of our knowledge, this is the first fundamental result to improve the standard utility $\widetilde O(\sqrt{d}/(n\varepsilon_\mathrm{DP}))$ for nonconvex objectives. Additionally, a more computation- and communication-efficient subroutine is combined with DIFF2 and its theoretical analysis is also given. Numerical experiments are conducted to validate the superiority of the DIFF2 framework.

Submitted 3 June, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

Comments: 26 pages

arXiv:2211.09489 [pdf, other]

doi 10.1103/PhysRevD.107.043508

How does SU($N$)-natural inflation isotropize the universe?

Authors: Tomoaki Murata, Tomohiro Fujita, Tsutomu Kobayashi

Abstract: We study the homogeneous and anisotropic dynamics of pseudoscalar inflation coupled to an SU($N$) gauge field. To see how the initially anisotropic universe is isotropized in such an inflation model, we derive the equations to obtain axisymmetric SU($N$) gauge field configurations in Bianchi type-I geometry and discuss a method to identify their isotropic subsets which are the candidates of their late-time attractor. Each isotropic solution is characterized by the corresponding SU(2) subalgebra of the SU($N$) algebra. It is shown numerically that the isotropic universe is a universal late-time attractor in the case of the SU(3) gauge field. Interestingly, we find that a transition between the two distinct gauge-field configurations characterized by different SU(2) subalgebras can occur during inflation. We clarify the conditions for this to occur. This transition could leave an observable imprint on the CMB and the primordial gravitational wave background.

Submitted 9 February, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

Comments: 18 pages, 11 figures

Report number: RUP-22-23

Journal ref: Phys. Rev. D 107, 043508 (2023)

arXiv:2210.13137 [pdf, ps, other]

Maps to toric varieties and toric degenerations

Authors: Takuya Murata, Lara Bossinger

Abstract: We study and construct maps to toric varieties. In the process, we generalize torus embeddings to the non-projective case. Moreover, we give an analog of Cox's construction of toric varieties as GIT quotients of affine spaces for the non-normal case after T. Kajiwara. The main focus of the paper is an application to toric degenerations, (proper) families whose special fibers are not-necessarily-normal toric varieties. We give a negative answer to a question of I. Dolgachev and K. Kaveh as to whether a toric degeneration can be constructed as a degeneration by projection. In the classical topology over the complex numbers, we recover an alternative construction of integrable systems, as was done by M. Harada and K. Kaveh in "Integrable systems, toric degenerations and Okounkov bodies", using a deformation retract. In particular, we have an analogue of a moment map from a variety admitting a toric degeneration to its Newton--Okounkov polytope.

Submitted 1 July, 2024; v1 submitted 24 October, 2022; originally announced October 2022.

Comments: Section 1 has been rewritten. Submitted to a journal

MSC Class: 14D06 (Primary); 37J35 (Secondary); 14M25 (Secondary)

arXiv:2209.00361 [pdf, other]

Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning

Authors: Kazusato Oko, Shunta Akiyama, Tomoya Murata, Taiji Suzuki

Abstract: While variance reduction methods have shown great success in solving large scale optimization problems, many of them suffer from accumulated errors and, therefore, periodically require the full gradient computation. In this paper, we present a single-loop algorithm named SLEDGE (Single-Loop mEthoD for Gradient Estimator) for finite-sum nonconvex optimization, which does not require periodic refresh of the gradient estimator but achieves nearly optimal gradient complexity. Unlike existing methods, SLEDGE has the advantage of versatility: (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under less heterogeneity of data. We build an efficient federated learning algorithm by exploiting these favorable properties. We show the first and second-order optimality of the output and also provide analysis under PL conditions. When the local budget is sufficiently large and clients are less (Hessian-)heterogeneous, the algorithm requires fewer communication rounds than existing methods such as FedAvg, SCAFFOLD, and Mime. The superiority of our method is verified in numerical experiments.

Submitted 4 October, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

arXiv:2207.00815 [pdf, other]

Simulating reaction time for Eureka effect in visual object recognition using artificial neural network

Authors: Kazufumi Hosoda, Shigeto Seno, Tsutomu Murata

Abstract: The human brain can recognize objects hidden in even severely degraded images after observing them for a while, which is known as a type of Eureka effect, possibly associated with human creativity. A previous psychological study suggests that the basis of this "Eureka recognition" is neural processes of coincidence of multiple stochastic activities. Here we constructed an artificial-neural-network-based model that simulated the characteristics of the human Eureka recognition.

Submitted 30 June, 2022; originally announced July 2022.

Comments: 2 pages, 2 figures

arXiv:2207.00107 [pdf, ps, other]

doi 10.1007/978-3-319-73198-8_11

Modularity Optimization as a Training Criterion for Graph Neural Networks

Authors: Tsuyoshi Murata, Naveed Afzal

Abstract: Graph convolution is a recent scalable method for performing deep feature learning on attributed graphs by aggregating local node information over multiple layers. Such layers only consider attribute information of node neighbors in the forward model and do not incorporate knowledge of global network structure in the learning task. In particular, the modularity function provides a convenient source of information about the community structure of networks. In this work we investigate the effect on the quality of learned representations by the incorporation of community structure preservation objectives of networks in the graph convolutional model. We incorporate the objectives in two ways, through an explicit regularization term in the cost function in the output layer and as an additional loss term computed via an auxiliary layer. We report the effect of community structure preserving terms in the graph convolutional architectures. Experimental evaluation on two attributed bibliographic networks showed that the incorporation of the community-preserving objective improves semi-supervised node classification accuracy in the sparse label regime.

Submitted 30 June, 2022; originally announced July 2022.

Comments: CompleNet 2018

Journal ref: Complex Networks IX pp 123-135 (2018)
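
The modularity function that the abstract above uses as a source of community information can be computed for a hard partition as follows. This is a sketch of the standard Newman modularity, $Q = \sum_c \left[ L_c/m - (d_c/2m)^2 \right]$, for an undirected, unweighted graph; it is not the regularization term or auxiliary loss used in the paper, and the function name `modularity` and the input format are assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")
    intra_edges = defaultdict(int)   # L_c: edges inside community c
    total_degree = defaultdict(int)  # d_c: sum of degrees in community c
    for u, v in edges:
        total_degree[community[u]] += 1
        total_degree[community[v]] += 1
        if community[u] == community[v]:
            intra_edges[community[u]] += 1
    return sum(intra_edges[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)
```

For two triangles joined by a single edge, assigning each triangle to its own community gives a positive score, while placing all nodes in one community gives zero.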

arXiv:2204.09232 [pdf, ps, other]

Visual-based Positioning and Pose Estimation

Authors: Somnuk Phon-Amnuaisuk, Ken T. Murata, La-Or Kovavisaruch, Tiong-Hoo Lim, Praphan Pavarangkoon, Takamichi Mizuhara

Abstract: Recent advances in deep learning and computer vision offer an excellent opportunity to investigate high-level visual analysis tasks such as human localization and human pose estimation. Although the performance of human localization and human pose estimation has significantly improved in recent reports, they are not perfect and erroneous localization and pose estimation can be expected among video frames. Studies on the integration of these techniques into a generic pipeline that is robust to noise introduced from those errors are still lacking. This paper addresses that gap. We explored and developed two working pipelines that suited the visual-based positioning and pose estimation tasks. Analyses of the proposed pipelines were conducted on a badminton game. We showed that the concept of tracking by detection could work well, and errors in position and pose could be effectively handled by a linear interpolation technique using information from nearby frames. The results showed that the Visual-based Positioning and Pose Estimation could deliver position and pose estimations with good spatial and temporal resolutions.

Submitted 20 April, 2022; originally announced April 2022.

Comments: This paper is the expanded version of our paper titled Visual-based Positioning and Pose Estimation, ICONIP (4) 2020: 410-417
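
The linear interpolation from nearby frames mentioned in the abstract above can be sketched as follows. The sketch assumes a track is a per-frame list of `(x, y)` positions with `None` marking frames where detection failed; the function name `interpolate_missing` and this data layout are illustrative assumptions, not the paper's pipeline.

```python
def interpolate_missing(track):
    """Fill None entries in a per-frame list of (x, y) positions.

    Gaps between two detected frames are filled by linear interpolation.
    Leading or trailing gaps have only one neighbour and are left as None.
    """
    filled = list(track)
    known = [i for i, position in enumerate(track) if position is not None]
    for left, right in zip(known, known[1:]):
        (x0, y0), (x1, y1) = track[left], track[right]
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = (x0 + weight * (x1 - x0), y0 + weight * (y1 - y0))
    return filled
```

The same routine could be applied per joint to a pose sequence, under the assumption that each joint is tracked as an independent 2D point.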

arXiv:2202.06083 [pdf, other]

Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning

Authors: Tomoya Murata, Taiji Suzuki

Abstract: In recent centralized nonconvex distributed learning and federated learning, local methods are one of the promising approaches to reduce communication time. However, existing work has mainly focused on studying first-order optimality guarantees. On the other side, second-order optimality guaranteed algorithms, i.e., algorithms escaping saddle points, have been extensively studied in the non-distributed optimization literature. In this paper, we study a new local algorithm called Bias-Variance Reduced Local Perturbed SGD (BVR-L-PSGD), that combines the existing bias-variance reduced gradient estimator with parameter perturbation to find second-order optimal points in centralized nonconvex distributed optimization. BVR-L-PSGD enjoys second-order optimality with nearly the same communication complexity as the best known one of BVR-L-SGD to find first-order optimality. Particularly, the communication complexity is better than non-local methods when the local datasets heterogeneity is smaller than the smoothness of the local loss. In an extreme case, the communication complexity approaches $\widetilde \Theta(1)$ when the local datasets heterogeneity goes to zero. Numerical results validate our theoretical findings.

Submitted 12 October, 2022; v1 submitted 12 February, 2022; originally announced February 2022.

Comments: 50 pages
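
The abstract above mentions parameter perturbation as the way to escape saddle points. A minimal sketch of that general idea follows, using plain perturbed gradient descent on a toy function; it is not BVR-L-PSGD, has no local or variance-reduced component, and every name and constant in it (`descend`, `radius`, `grad_tol`, `wait`, the test function) is a hypothetical choice made for illustration.

```python
import math
import random


def grad(x, y):
    """Gradient of the toy function f(x, y) = x**2 + y**4 / 4 - y**2 / 2.

    f has a saddle point at the origin and minima at (0, 1) and (0, -1).
    """
    return 2.0 * x, y ** 3 - y


def descend(x, y, steps=500, lr=0.1, perturb=False, radius=0.01,
            grad_tol=1e-3, wait=10, seed=0):
    """Gradient descent, optionally with small random kicks.

    When `perturb` is True and the gradient norm is below `grad_tol`,
    a uniform kick of size at most `radius` per coordinate is added,
    at most once every `wait` steps (all values are illustrative).
    """
    rng = random.Random(seed)
    last_kick = -wait
    for t in range(steps):
        gx, gy = grad(x, y)
        if perturb and math.hypot(gx, gy) < grad_tol and t - last_kick >= wait:
            x += rng.uniform(-radius, radius)
            y += rng.uniform(-radius, radius)
            last_kick = t
            gx, gy = grad(x, y)  # recompute at the perturbed point
        x -= lr * gx
        y -= lr * gy
    return x, y


stuck = descend(0.0, 0.0)                  # gradient is zero at the saddle
escaped = descend(0.0, 0.0, perturb=True)  # the kick breaks the symmetry
```

Started exactly at the saddle, the unperturbed run never moves, while the perturbed run drifts along the negative-curvature direction and settles near one of the two minima.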

arXiv:2112.05914 [pdf, other]

Leaping Through Time with Gradient-based Adaptation for Recommendation

Authors: Nuttapong Chairatanakul, Hoang NT, Xin Liu, Tsuyoshi Murata

Abstract: Modern recommender systems are required to adapt to the change in user preferences and item popularity. Such a problem is known as the temporal dynamics problem, and it is one of the main challenges in recommender system modeling. Different from the popular recurrent modeling approach, we propose a new solution named LeapRec to the temporal dynamic problem by using trajectory-based meta-learning to model time dependencies. LeapRec characterizes temporal dynamics by two complement components named global time leap (GTL) and ordered time leap (OTL). By design, GTL learns long-term patterns by finding the shortest learning path across unordered temporal data. Cooperatively, OTL learns short-term patterns by considering the sequential nature of the temporal data. Our experimental results show that LeapRec consistently outperforms the state-of-the-art methods on several datasets and recommendation metrics. Furthermore, we provide an empirical study of the interaction between GTL and OTL, showing the effects of long- and short-term modeling.

Submitted 28 December, 2021; v1 submitted 10 December, 2021; originally announced December 2021.

Comments: Accepted by AAAI-2022. Preprint version

arXiv:2111.06748 [pdf, other]

Simplifying approach to Node Classification in Graph Neural Networks

Authors: Sunil Kumar Maurya, Xin Liu, Tsuyoshi Murata

Abstract: Graph Neural Networks have become one of the indispensable tools to learn from graph-structured data, and their usefulness has been shown in wide variety of tasks. In recent years, there have been tremendous improvements in architecture design, resulting in better performance on various prediction tasks. In general, these neural architectures combine node feature aggregation and feature transformation using learnable weight matrix in the same layer. This makes it challenging to analyze the importance of node features aggregated from various hops and the expressiveness of the neural network layers. As different graph datasets show varying levels of homophily and heterophily in features and class label distribution, it becomes essential to understand which features are important for the prediction tasks without any prior information. In this work, we decouple the node feature aggregation step and depth of graph neural network, and empirically analyze how different aggregated features play a role in prediction performance. We show that not all features generated via aggregation steps are useful, and often using these less informative features can be detrimental to the performance of the GNN model. Through our experiments, we show that learning certain subsets of these features can lead to better performance on wide variety of datasets. We propose to use softmax as a regularizer and "soft-selector" of features aggregated from neighbors at different hop distances; and L2-Normalization over GNN layers. Combining these techniques, we present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model achieves comparable or even higher accuracy than state-of-the-art GNN models in nine benchmark datasets for the node classification task, with remarkable improvements up to 51.1%.

Submitted 12 November, 2021; originally announced November 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2105.07634
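
The "soft-selector" idea in the abstract above, a softmax over learnable scores that weights features aggregated at different hop distances, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the FSGNN implementation: mean aggregation over neighbours, the absence of per-hop linear layers and normalization, and the names `mean_aggregate`, `soft_select`, and `hop_scores` are all hypothetical choices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_aggregate(neighbors, feats):
    """One hop: each node takes the mean of its neighbours' features."""
    dim = len(feats[0])
    out = []
    for nbrs in neighbors:
        if not nbrs:
            out.append([0.0] * dim)  # isolated node: no signal (assumption)
        else:
            out.append([sum(feats[j][d] for j in nbrs) / len(nbrs)
                        for d in range(dim)])
    return out


def soft_select(neighbors, feats, hop_scores):
    """Weight 0-hop, 1-hop, ... features by softmax(hop_scores), then concatenate."""
    weights = softmax(hop_scores)
    hop_feats = [feats]
    for _ in range(len(hop_scores) - 1):
        hop_feats.append(mean_aggregate(neighbors, hop_feats[-1]))
    combined = []
    for i in range(len(feats)):
        row = []
        for w, hf in zip(weights, hop_feats):
            row.extend(w * v for v in hf[i])
        combined.append(row)
    return weights, combined


# Path graph 0 - 1 - 2 with one scalar feature per node.
neighbors = [[1], [0, 2], [1]]
feats = [[1.0], [0.0], [2.0]]
weights, combined = soft_select(neighbors, feats, hop_scores=[0.0, 0.0, 0.0])
```

In a trained model the hop scores would be learned, so inspecting `weights` afterwards gives a rough indication of which hop distances the model relies on.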

arXiv:2110.09006 [pdf, other]

Natural Image Reconstruction from fMRI using Deep Learning: A Survey

Authors: Zarina Rakhimberdina, Quentin Jodelet, Xin Liu, Tsuyoshi Murata

Abstract: With the advent of brain imaging techniques and machine learning tools, much effort has been devoted to building computational models to capture the encoding of visual information in the human brain. One of the most challenging brain decoding tasks is the accurate reconstruction of the perceived natural images from brain activities measured by functional magnetic resonance imaging (fMRI). In this work, we survey the most recent deep learning methods for natural image reconstruction from fMRI. We examine these methods in terms of architectural design, benchmark datasets, and evaluation metrics and present a fair performance evaluation across standardized evaluation metrics. Finally, we discuss the strengths and limitations of existing studies and present potential future directions.

Submitted 24 November, 2021; v1 submitted 18 October, 2021; originally announced October 2021.

Comments: Accepted for publication in Frontiers in Neuroscience

Journal ref: https://www.frontiersin.org/articles/10.3389/fnins.2021.795488/abstract

arXiv:2109.04400 [pdf]

Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph

Authors: Nuttapong Chairatanakul, Noppayut Sriwatanasakdi, Nontawat Charoenphakdee, Xin Liu, Tsuyoshi Murata

Abstract: In cross-lingual text classification, it is required that task-specific training data in high-resource source languages are available, where the task is identical to that of a low-resource target language. However, collecting such training data can be infeasible because of the labeling cost, task characteristics, and privacy concerns. This paper proposes an alternative solution that uses only task-independent word embeddings of high-resource languages and bilingual dictionaries. First, we construct a dictionary-based heterogeneous graph (DHG) from bilingual dictionaries. This opens the possibility to use graph neural networks for cross-lingual transfer. The remaining challenge is the heterogeneity of DHG because multiple languages are considered. To address this challenge, we propose dictionary-based heterogeneous graph neural network (DHGNet) that effectively handles the heterogeneity of DHG by two-step aggregations, which are word-level and language-level aggregations. Experimental results demonstrate that our method outperforms pretrained models even though it does not access to large corpora. Furthermore, it can perform well even though dictionaries contain many incorrect translations. Its robustness allows the usage of a wider range of dictionaries such as an automatically constructed dictionary and crowdsourced dictionary, which are convenient for real-world applications.

Submitted 9 September, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: Published in Findings of EMNLP 2021

arXiv:2107.07199 [pdf, other]

doi 10.1103/PhysRevD.104.083514

Dynamics of inflation with mutually orthogonal vector fields in a closed universe

Authors: Tomoaki Murata, Tsutomu Kobayashi

Abstract: We study the dynamics of a homogeneous, isotropic, and positively curved universe in the presence of a SU(2) gauge field or a triplet of mutually orthogonal vector fields. In the SU(2) case we use the previously known ansatz for the gauge-field configuration, but the case without non-abelian symmetries is more nontrivial and we develop a new ansatz. We in particular consider axion-SU(2) inflation and inflation with vector fields having U(1)$\times$U(1)$\times$U(1) symmetry, and analyze their dynamics in detail numerically. Novel effects of the spatial curvature come into play through vector fields, which causes unconventional pre-inflationary dynamics. It is found that the closed universe with vector fields is slightly more stable against collapse than that filled solely with an inflaton field.

Submitted 20 September, 2021; v1 submitted 15 July, 2021; originally announced July 2021.

Comments: 9 pages, 7 figures

Report number: RUP-21-12

Journal ref: Phys. Rev. D 104, 083514 (2021)

arXiv:2105.07634 [pdf, other]

Improving Graph Neural Networks with Simple Architecture Design

Authors: Sunil Kumar Maurya, Xin Liu, Tsuyoshi Murata

Abstract: Graph Neural Networks have emerged as a useful tool to learn on the data by applying additional constraints based on the graph structure. These graphs are often created with assumed intrinsic relations between the entities. In recent years, there have been tremendous improvements in the architecture design, pushing the performance up in various prediction tasks. In general, these neural architectures combine layer depth and node feature aggregation steps. This makes it challenging to analyze the importance of features at various hops and the expressiveness of the neural network layers. As different graph datasets show varying levels of homophily and heterophily in features and class label distribution, it becomes essential to understand which features are important for the prediction tasks without any prior information. In this work, we decouple the node feature aggregation step and depth of graph neural network and introduce several key design strategies for graph neural networks. More specifically, we propose to use softmax as a regularizer and "Soft-Selector" of features aggregated from neighbors at different hop distances; and "Hop-Normalization" over GNN layers. Combining these techniques, we present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model outperforms other state of the art GNN models and achieves up to 64% improvements in accuracy on node classification tasks. Moreover, analyzing the learned soft-selection parameters of the model provides a simple way to study the importance of features in the prediction tasks. Finally, we demonstrate with experiments that the model is scalable for large graphs with millions of nodes and billions of edges.

Submitted 17 May, 2021; originally announced May 2021.

arXiv:2105.06259 [pdf, other]

doi 10.1088/1475-7516/2021/09/031

The isotropic attractor solution of axion-SU(2) inflation: Universal isotropization in Bianchi type-I geometry

Authors: Ira Wolfson, Azadeh Maleknejad, Tomoaki Murata, Eiichiro Komatsu, Tsutomu Kobayashi

Abstract: SU(2) gauge fields coupled to an axion field can acquire an isotropic background solution during inflation. We study homogeneous but anisotropic inflationary solutions in the presence of such (massless) gauge fields. A gauge field in the cosmological background may pose a threat to spatial isotropy. We show, however, that such models $\textit{generally}$ isotropize in Bianchi type-I geometry, and the isotropic solution is the attractor. Restricting the setup by adding an axial symmetry, we revisited the numerical analysis presented in Wolfson et al. (2020). We find that the reported numerical breakdown in the previous analysis is an artifact of parametrization singularity. We use a new parametrization that is well-defined all over the phase space. We show that the system respects the cosmic no-hair conjecture and the anisotropies always dilute away within a few e-folds.

Submitted 27 September, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

Comments: 23 pages, 14 figures. Published 23 September 2021 Updated to published version

Report number: CERN-TH-2021-076, RUP-21-8

Journal ref: JCAP09(2021)031

arXiv:2103.12532 [pdf, other]

doi 10.1016/j.cviu.2022.103582

Balanced softmax cross-entropy for incremental learning with and without memory

Authors: Quentin Jodelet, Xin Liu, Tsuyoshi Murata

Abstract: When incrementally trained on new classes, deep neural networks are subject to catastrophic forgetting which leads to an extreme deterioration of their performance on the old classes while learning the new ones. Using a small memory containing few samples from past classes has shown to be an effective method to mitigate catastrophic forgetting. However, due to the limited size of the replay memory, there is a large imbalance between the number of samples for the new and the old classes in the training dataset resulting in bias in the final model. To address this issue, we propose to use the Balanced Softmax Cross-Entropy and show that it can be seamlessly combined with state-of-the-art approaches for class-incremental learning in order to improve their accuracy while also potentially decreasing the computational cost of the training procedure. We further extend this approach to the more demanding class-incremental learning without memory setting and achieve competitive results with memory-based approaches. Experiments on the challenging ImageNet, ImageNet-Subset and CIFAR100 benchmarks with various settings demonstrate the benefits of our approach.

Submitted 14 November, 2022; v1 submitted 23 March, 2021; originally announced March 2021.

Comments: Journal extension of the ICANN 2021 paper (arXiv:2103.12532v3), published in Computer Vision and Image Understanding
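
The abstract above does not spell out the loss, so the following is a minimal sketch assuming the commonly used Balanced Softmax formulation, in which each logit is shifted by the logarithm of its class's training-sample count before the usual cross-entropy is taken. The function name and the example counts are hypothetical.

```python
import math


def balanced_softmax_ce(logits, label, class_counts):
    """Cross-entropy on logits shifted by log(class count).

    Assumed formulation: loss = -log( n_y * exp(z_y) / sum_j n_j * exp(z_j) ),
    where n_j is the number of training samples of class j.
    """
    adjusted = [z + math.log(n) for z, n in zip(logits, class_counts)]
    m = max(adjusted)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


# Two classes with identical logits: 900 "new" samples vs. 100 replayed "old" ones.
loss_new = balanced_softmax_ce([0.0, 0.0], label=0, class_counts=[900, 100])
loss_old = balanced_softmax_ce([0.0, 0.0], label=1, class_counts=[900, 100])
```

With equal class counts the shift cancels and the function reduces to the standard softmax cross-entropy. With imbalanced counts, a sample from the rare class incurs a larger loss unless its logit is raised, which is how the shift counteracts the bias toward the over-represented new classes.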

arXiv:2103.10783 [pdf]

doi 10.1016/j.nimb.2018.09.018

Investigation of alpha particle induced reactions on natural silver in the 40-50 MeV energy range

Authors: F. Ditrói, S. Takács, H. Haba, Y. Komori, M. Aikawa, M. Saito, T. Murata

Abstract: Natural silver targets have been irradiated by using a 50 MeV alpha-particle beam in order to measure the activation cross sections of radioisotopes in the 40-50 MeV energy range. Among the radio-products there are medically important isotopes such as $^{110m}$In and $^{111}$In. For optimizing the production of these radioisotopes and regarding their purity and specific activity the cross section data for every produced radioisotope are important. New data are measured in this energy range and the results of some previous measurements have been confirmed. Physical yield curves have been calculated by using the new cross section data completed with the results from the literature.

Submitted 19 March, 2021; originally announced March 2021.

Journal ref: Nuclear Instruments and Methods B 436(2018)119-129

arXiv:2102.03198 [pdf, other]

Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning

Authors: Tomoya Murata, Taiji Suzuki

Abstract: Recently, local SGD has got much attention and been extensively studied in the distributed learning community to overcome the communication bottleneck problem. However, the superiority of local SGD to minibatch SGD only holds in quite limited situations. In this paper, we study a new local algorithm called Bias-Variance Reduced Local SGD (BVR-L-SGD) for nonconvex distributed optimization. Algorithmically, our proposed bias and variance reduced local gradient estimator fully utilizes small second-order heterogeneity of local objectives and suggests randomly picking up one of the local models instead of taking the average of them when workers are synchronized. Theoretically, under small heterogeneity of local objectives, we show that BVR-L-SGD achieves better communication complexity than both the previous non-local and local methods under mild conditions, and particularly BVR-L-SGD is the first method that breaks the barrier of communication complexity $\Theta(1/\varepsilon)$ for general nonconvex smooth objectives when the heterogeneity is small and the local computation budget is large. Numerical results are given to verify the theoretical findings and give empirical evidence of the superiority of our method.

Submitted 13 June, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

Comments: 19 pages

arXiv:2011.10988 [pdf, other]

Stacked Graph Filter

Authors: Hoang NT, Takanori Maehara, Tsuyoshi Murata

Abstract: We study Graph Convolutional Networks (GCN) from the graph signal processing viewpoint by addressing a difference between learning graph filters with fully connected weights versus trainable polynomial coefficients. We find that by stacking graph filters with learnable polynomial parameters, we can build a highly adaptive and robust vertex classification model. Our treatment here relaxes the low-frequency (or equivalently, high homophily) assumptions in existing vertex classification models, resulting a more ubiquitous solution in terms of spectral properties. Empirically, by using only one hyper-parameter setting, our model achieves strong results on most benchmark datasets across the frequency spectrum.

Submitted 22 November, 2020; originally announced November 2020.

Comments: Source code is provided at github.com/gear/sgf

arXiv:2010.12177 [pdf, other]

doi 10.1017/jfm.2021.697

Sparse identification of nonlinear dynamics with low-dimensionalized flow representations

Authors: Kai Fukami, Takaaki Murata, Kai Zhang, Koji Fukagata

Abstract: We perform a sparse identification of nonlinear dynamics (SINDy) for low-dimensionalized complex flow phenomena. We first apply the SINDy with two regression methods, the thresholded least square algorithm (TLSA) and the adaptive Lasso (Alasso) which show reasonable ability with a wide range of sparsity constant in our preliminary tests, to a two-dimensional single cylinder wake at $Re_D=100$, its transient process, and a wake of two-parallel cylinders, as examples of high-dimensional fluid data. To handle these high dimensional data with SINDy whose library matrix is suitable for low-dimensional variable combinations, a convolutional neural network-based autoencoder (CNN-AE) is utilized. The CNN-AE is employed to map a high-dimensional dynamics into a low-dimensional latent space. The SINDy then seeks a governing equation of the mapped low-dimensional latent vector. Temporal evolution of high-dimensional dynamics can be provided by combining the predicted latent vector by SINDy with the CNN decoder which can remap the low-dimensional latent vector to the original dimension. The SINDy can provide a stable solution as the governing equation of the latent dynamics and the CNN-SINDy based modeling can reproduce high-dimensional flow fields successfully, although more terms are required to represent the transient flow and the two-parallel cylinder wake than the periodic shedding. A nine-equation turbulent shear flow model is finally considered to examine the applicability of SINDy to turbulence, although without using CNN-AE. The present results suggest that the proposed scheme with an appropriate parameter choice enables us to analyze high-dimensional nonlinear dynamics with interpretable low-dimensional manifolds.

Submitted 1 August, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

Journal ref: J. Fluid Mech. 926, A10 (2021)

arXiv:2009.04928 [pdf, ps, other]

Generic tropical initial ideals of Cohen-Macaulay algebras

Authors: Kiumars Kaveh, Christopher Manon, Takuya Murata

Abstract: We study the generic tropical initial ideals of a positively graded Cohen-Macaulay algebra $R$ over an algebraically closed field $\mathbf{k}$. Building on work of Römer and Schmitz, we give a formula for each initial ideal, and we express the associated quasivaluations in terms of certain $I$-adic filtrations. As a corollary, we show that in the case that $R$ is a domain, every initial ideal coming from the codimension-$1$ skeleton of the tropical variety is prime, so "generic presentations of Cohen-Macaulay domains are well-poised in codimension-$1$."

Submitted 15 January, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 11 pages, to appear in J. Pure Appl. Algebra

MSC Class: 14T05; 14M05; 13A18

arXiv:2007.11230 [pdf, other]

MetAL: Active Semi-Supervised Learning on Graphs via Meta Learning

Authors: Kaushalya Madhawa, Tsuyoshi Murata

Abstract: The objective of active learning (AL) is to train classification models with less number of labeled instances by selecting only the most informative instances for labeling. The AL algorithms designed for other data types such as images and text do not perform well on graph-structured data. Although a few heuristics-based AL algorithms have been proposed for graphs, a principled approach is lacking. In this paper, we propose MetAL, an AL approach that selects unlabeled instances that directly improve the future performance of a classification model. For a semi-supervised learning problem, we formulate the AL task as a bilevel optimization problem. Based on recent work in meta-learning, we use the meta-gradients to approximate the impact of retraining the model with any unlabeled instance on the model performance. Using multiple graph datasets belonging to different domains, we demonstrate that MetAL efficiently outperforms existing state-of-the-art AL algorithms.

Submitted 22 July, 2020; originally announced July 2020.

Comments: 16 pages, 4 figures

arXiv:2007.04583 [pdf, other]

doi 10.1016/j.future.2020.11.016

Graph Convolutional Networks for Graphs Containing Missing Features

Authors: Hibiki Taguchi, Xin Liu, Tsuyoshi Murata

Abstract: Graph Convolutional Network (GCN) has experienced great success in graph analysis tasks. It works by smoothing the node features across the graph. The current GCN models overwhelmingly assume that the node feature information is complete. However, real-world graph data are often incomplete and containing missing features. Traditionally, people have to estimate and fill in the unknown features based on imputation techniques and then apply GCN. However, the process of feature filling and graph learning are separated, resulting in degraded and unstable performance. This problem becomes more serious when a large number of features are missing. We propose an approach that adapts GCN to graphs containing missing features. In contrast to traditional strategy, our approach integrates the processing of missing features and graph learning within the same neural network architecture. Our idea is to represent the missing data by Gaussian Mixture Model (GMM) and calculate the expected activation of neurons in the first hidden layer of GCN, while keeping the other layers of the network unchanged. This enables us to learn the GMM parameters and network weight parameters in an end-to-end manner. Notably, our approach does not increase the computational complexity of GCN and it is consistent with GCN when the features are complete. We demonstrate through extensive experiments that our approach significantly outperforms the imputation-based methods in node classification and link prediction tasks. We show that the performance of our approach for the case with a low level of missing features is even superior to GCN for the case with complete features.

Submitted 6 December, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

Journal ref: Future Generation Computer Systems, Volume 117, Pages 155-168, 2021

arXiv:2006.10925 [pdf, other]

Gradient Descent in RKHS with Importance Labeling

Authors: Tomoya Murata, Taiji Suzuki

Abstract: Labeling cost is often expensive and is a fundamental limitation of supervised learning. In this paper, we study importance labeling problem, in which we are given many unlabeled data and select a limited number of data to be labeled from the unlabeled data, and then a learning algorithm is executed on the selected one. We propose a new importance labeling scheme that can effectively select an informative subset of unlabeled data in least squares regression in Reproducing Kernel Hilbert Spaces (RKHS). We analyze the generalization error of gradient descent combined with our labeling scheme and show that the proposed algorithm achieves the optimal rate of convergence in much wider settings and especially gives much better generalization ability in a small label noise setting than the usual uniform sampling scheme. Numerical experiments verify our theoretical findings.

Submitted 12 April, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: 18 pages, 14 figures

arXiv:2003.07548 [pdf, other]

doi 10.1007/s00162-020-00528-w

Machine-learning-based reduced order modeling for unsteady flows around bluff bodies of various shapes

Authors: Kazuto Hasegawa, Kai Fukami, Takaaki Murata, Koji Fukagata

Abstract: We propose a method to construct a reduced order model with machine learning for unsteady flows. The present machine-learned reduced order model (ML-ROM) is constructed by combining a convolutional neural network autoencoder (CNN-AE) and a long short-term memory (LSTM), which are trained in a sequential manner. First, the CNN-AE is trained using direct numerical simulation (DNS) data so as to map the high-dimensional flow data into low-dimensional latent space. Then, the LSTM is utilized to establish a temporal prediction system for the low-dimensionalized vectors obtained by CNN-AE. As a test case, we consider flows around a bluff body whose shape is defined using a combination of trigonometric functions with random amplitudes. The present ML-ROMs are trained on a set of 80 bluff body shapes and tested on a different set of 20 bluff body shapes not used for training, with both training and test shapes chosen from the same random distribution. The flow fields are confirmed to be well reproduced by the present ML-ROM in terms of various statistics. We also focus on the influence of two main parameters: (1) the latent vector size in the CNN-AE, and (2) the time step size between the mapped vectors used for the LSTM. The present results show that the ML-ROM works well even for unseen shapes of bluff bodies when these parameters are properly chosen, which implies great potential for the present type of ML-ROM to be applied to more complex flows.

Submitted 17 March, 2020; originally announced March 2020.

Comments: 18 pages, 20 figures

Journal ref: Theor. Comput. Fluid Dyn. 34, 367-383 (2020)

arXiv:1906.04029 [pdf, other]

doi 10.1017/jfm.2019.822

Nonlinear mode decomposition with convolutional neural networks for fluid dynamics

Authors: Takaaki Murata, Kai Fukami, Koji Fukagata

Abstract: We present a new nonlinear mode decomposition method to visualize the decomposed flow fields, named the mode decomposing convolutional neural network autoencoder (MD-CNN-AE). The proposed method is applied to a flow around a circular cylinder at $Re_D=100$ as a test case. The flow attributes are mapped into two modes in the latent space and then these two modes are visualized in the physical space. Because the MD-CNN-AEs with nonlinear activation functions show lower reconstruction errors than the proper orthogonal decomposition (POD), the nonlinearity contained in the activation function is considered the key to improve the capability of the model. It is found by applying POD to each field decomposed using the MD-CNN-AE with hyperbolic tangent activation that a single nonlinear MD-CNN-AE mode contains multiple orthogonal bases, in contrast to the linear methods, i.e., POD and the MD-CNN-AE with linear activation. We further assess the proposed MD-CNN-AE by applying it to a transient process of a circular cylinder wake in order to examine its capability for flows containing high-order spatial modes. The present results suggest a great potential for the nonlinear MD-CNN-AE to be used for feature extraction of flow fields in lower dimension than POD, while retaining interpretable relationships with the conventional POD modes.

Submitted 10 October, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

Comments: 15 pages, 14 figures

Journal ref: J. Fluid Mech. 882, A13 (2020)

arXiv:1905.12224 [pdf, other]

Accelerated Sparsified SGD with Error Feedback

Authors: Tomoya Murata, Taiji Suzuki

Abstract: A stochastic gradient method for synchronous distributed optimization is studied. For reducing communication cost, we particularly focus on utilization of compression of communicated gradients. Several works have shown that the sparsified stochastic gradient descent method (SGD) with error feedback asymptotically achieves the same rate as (non-sparsified) parallel SGD. However, from a viewpoint of non-asymptotic behavior, the compression error may cause slower convergence than non-sparsified SGD in early iterations. This is problematic in practical situations since early stopping is often adopted to maximize the generalization ability of learned models. For improving the previous results, we propose and theoretically analyse a sparsified stochastic gradient method with error feedback scheme combined with Nesterov's acceleration. It is shown that the necessary per iteration communication cost for maintaining the same rate as vanilla SGD can be smaller than non-accelerated methods in convex and even in nonconvex optimization problems. This indicates that our proposed method makes better use of compressed information than previous methods. Numerical experiments are provided and empirically validate our theoretical findings.

Submitted 18 June, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

Comments: 25 pages, 16 figures

arXiv:1905.01591 [pdf, other]

Learning Graph Neural Networks with Noisy Labels

Authors: Hoang NT, Choong Jun Jin, Tsuyoshi Murata

Abstract: We study the robustness of GNN training procedures to symmetric label noise. By combining the nonlinear neural message-passing models (e.g. Graph Isomorphism Networks, GraphSAGE, etc.) with loss correction methods, we present a noise-tolerant approach for the graph classification task. Our experiments show that test accuracy can be improved under the artificial symmetric noisy setting.

Submitted 4 May, 2019; originally announced May 2019.

Comments: 5 pages, 4 figures, 3 tables; Appeared as a poster presentation at Limited Labeled Data (LLD) Workshop, ICLR 2019

arXiv:1809.01765 [pdf, other]

Sample Efficient Stochastic Gradient Iterative Hard Thresholding Method for Stochastic Sparse Linear Regression with Limited Attribute Observation

Authors: Tomoya Murata, Taiji Suzuki

Abstract: We develop new stochastic gradient methods for efficiently solving sparse linear regression in a partial attribute observation setting, where learners are only allowed to observe a fixed number of actively chosen attributes per example at training and prediction times. It is shown that the methods achieve essentially a sample complexity of $O(1/\varepsilon)$ to attain an error of $\varepsilon$ under a variant of the restricted eigenvalue condition, and the rate has better dependency on the problem dimension than existing methods. In particular, if the smallest magnitude of the non-zero components of the optimal solution is not too small, the rate of our proposed Hybrid algorithm can be boosted to near the minimax optimal sample complexity of full information algorithms. The core ideas are (i) efficient construction of an unbiased gradient estimator by the iterative usage of the hard thresholding operator for configuring an exploration algorithm; and (ii) an adaptive combination of the exploration and exploitation algorithms for quickly identifying the support of the optimum and efficiently searching the optimal parameter in its support. Experimental results are presented to validate our theoretical findings and the superiority of our proposed methods.

Submitted 30 November, 2018; v1 submitted 5 September, 2018; originally announced September 2018.

Comments: 23 pages, 2 figures

arXiv:1808.08675 [pdf, ps, other]

Exploring the Applications of Faster R-CNN and Single-Shot Multi-box Detection in a Smart Nursery Domain

Authors: Somnuk Phon-Amnuaisuk, Ken T. Murata, Praphan Pavarangkoon, Kazunori Yamamoto, Takamichi Mizuhara

Abstract: The ultimate goal of a baby detection task concerns detecting the presence of a baby and other objects in a sequence of 2D images, tracking them and understanding the semantic contents of the scene. Recent advances in deep learning and computer vision offer various powerful tools in general object detection and can be applied to a baby detection task. In this paper, the Faster Region-based Convolutional Neural Network and the Single-Shot Multi-Box Detection approaches are explored. They are the two state-of-the-art object detectors based on the region proposal tactic and the multi-box tactic. The presence of a baby in the scene obtained from these detectors, tested using different pre-trained models, is discussed. This study is important since the behaviors of these detectors in a baby detection task using different pre-trained models are still not well understood. This exploratory study reveals many useful insights into the applications of these object detectors in the smart nursery domain.

Submitted 26 August, 2018; originally announced August 2018.

Comments: 11 pages, 4 figures

arXiv:1808.08558 [pdf, other]

Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error

Authors: Taiji Suzuki, Hiroshi Abe, Tomoya Murata, Shingo Horiuchi, Kotaro Ito, Tokuma Wachi, So Hirai, Masatoshi Yukishima, Tomoaki Nishimura

Abstract: Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for analyzing the generalization error of deep learning, known as the compression-based error bound. However, there is still a huge gap between a practically effective compression method and its rigorous background of statistical learning theory. To resolve this issue, we develop a new theoretical framework for model compression and propose a new pruning method called spectral pruning based on this framework. We define the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes and show that the compression ability is essentially controlled by this quantity. Moreover, we present a sharp generalization error bound of the compressed model and characterize the bias-variance tradeoff induced by the compression procedure. We apply our method to several datasets to justify our theoretical analyses and show the superiority of the proposed method.

Submitted 13 July, 2020; v1 submitted 26 August, 2018; originally announced August 2018.

Comments: 17 pages, 4 figures. Accepted in IJCAI-PRICAI 2020. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, pages 2839-2846

arXiv:1804.07059 [pdf, other]

Exploring Partially Observed Networks with Nonparametric Bandits

Authors: Kaushalya Madhawa, Tsuyoshi Murata

Abstract: Real-world networks such as social and communication networks are too large to be observed entirely. Such networks are often partially observed such that network size, network topology, and nodes of the original network are unknown. In this paper we formalize the Adaptive Graph Exploring problem. We assume that we are given an incomplete snapshot of a large network and additional nodes can be discovered by querying nodes in the currently observed network. The goal of this problem is to maximize the number of observed nodes within a given query budget. Which set of nodes should be queried to maximize the size of the observed network? We formulate this problem as an exploration-exploitation problem and propose a novel nonparametric multi-arm bandit (MAB) algorithm for identifying which nodes to query. Our contributions include: (1) $i$KNN-UCB, a novel nonparametric MAB algorithm that applies $k$-nearest neighbor UCB to the setting where the arms are presented in a vector space, (2) a theoretical guarantee that the $i$KNN-UCB algorithm has sublinear regret, and (3) experiments applying the $i$KNN-UCB algorithm to synthetic networks and real-world networks from different domains, which show that our method discovers up to 40% more nodes compared to existing baselines.

Submitted 19 April, 2018; originally announced April 2018.

Comments: 15 pages, 6 figures, currently under review

arXiv:1708.02698 [pdf, ps, other]

On degenerations of projective varieties to complexity-one T-varieties

Authors: Kiumars Kaveh, Christopher Manon, Takuya Murata

Abstract: Let $R$ be a positively graded finitely generated $\textbf{k}$-domain with Krull dimension $d+1$. We show that there is a homogeneous valuation $\mathfrak{v}: R \setminus \{0\} \to \mathbb{Z}^d$ of rank $d$ such that the associated graded $\text{gr}_\mathfrak{v}(R)$ is finitely generated. This then implies that any polarized $d$-dimensional projective variety $X$ has a flat deformation over $\mathbb{A}^1$, with reduced and irreducible fibers, to a polarized projective complexity-one $T$-variety (i.e. a variety with a faithful action of a $(d-1)$-dimensional torus $T$). As an application we conclude that any $d$-dimensional complex smooth projective variety $X$ equipped with an integral Kähler form has a proper $(d-1)$-dimensional Hamiltonian torus action on an open dense subset that extends continuously to all of $X$.

Submitted 13 January, 2020; v1 submitted 8 August, 2017; originally announced August 2017.

Comments: Presentation improved in many places and many typos fixed

MSC Class: 14D06; 13D10; 14M25

arXiv:1705.09868 [pdf]

Energy flow in biological system: Bioenergy transduction of V1-ATPase molecular rotary motor from E. hirae

Authors: Ichiro Yamato, Takeshi Murata, Andrei Khrennikov

Abstract: We classify research fields in biology into those on flows of materials, energy, and information. As a representative energy-transducing machinery in biology, we introduce our research target, V1-ATPase from the bacterium Enterococcus hirae, a typical molecular rotary motor. Structures of several intermediates of the rotary motor are described and the molecular mechanism of the motor converting chemical energy into mechanical energy is discussed. Comments and considerations on the information flow in biology, especially on the thermodynamic entropy in quantum physical and biological systems, are added in a separate section containing a biologist-friendly presentation of this complicated question.

Submitted 27 May, 2017; originally announced May 2017.

Comments: Progress in Biophysics and Molecular Biology, 2017

Journal ref: Progress in Biophysics and Molecular Biology 130, Part A, 33-38 (2017)

arXiv:1703.00439 [pdf, other]

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

Authors: Tomoya Murata, Taiji Suzuki

Abstract: In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches is becoming a gold standard in the machine learning community, because mini-batch settings stabilize the gradient estimate and can easily make good use of parallel computing. The core of our proposed method is the incorporation of our new "double acceleration" technique and variance reduction technique. We theoretically analyze our proposed method and show that our method substantially improves the mini-batch efficiencies of previous accelerated stochastic methods, and essentially only needs size $\sqrt{n}$ mini-batches for achieving the optimal iteration complexities for both non-strongly and strongly convex objectives, where $n$ is the training set size. Further, we show that even in non-mini-batch settings, our method achieves the best known convergence rate for both non-strongly and strongly convex objectives.

Submitted 19 September, 2017; v1 submitted 1 March, 2017; originally announced March 2017.

Comments: 27 pages, 9 figures

arXiv:1610.01464 [pdf, ps, other]

doi 10.1088/1361-6633/aa5e6c

Towards a Unified Model of Neutrino-Nucleus Reactions for Neutrino Oscillation Experiments

Authors: S. X. Nakamura, H. Kamano, Y. Hayato, M. Hirai, W. Horiuchi, S. Kumano, T. Murata, K. Saito, M. Sakuda, T. Sato, Y. Suzuki

Abstract: A precise description of neutrino-nucleus reactions will play a key role in addressing fundamental questions such as the leptonic CP violation and the neutrino mass hierarchy through analyzing data from next-generation neutrino oscillation experiments. The neutrino energy relevant to the neutrino-nucleus reactions spans a broad range and, accordingly, the dominant reaction mechanism varies across the energy region from quasi-elastic scattering through nucleon resonance excitations to deep inelastic scattering. This corresponds to transitions of the effective degree of freedom for theoretical description from nucleons through meson-baryon to quarks. The main purpose of this review is to report our recent efforts towards a unified description of the neutrino-nucleus reactions over the wide energy range; recent overall progress in the field is also sketched. Starting with an overview of the current status of neutrino-nucleus scattering experiments, we formulate the cross section to be commonly used for the reactions over all the energy regions. A description of the neutrino-nucleon reactions follows and, in particular, a dynamical coupled-channels model for meson productions in and beyond the $Δ$(1232) region is discussed in detail. We then discuss the neutrino-nucleus reactions, putting emphasis on our theoretical approaches. We start the discussion with electroweak processes in few-nucleon systems studied with the correlated Gaussian method. Then we describe quasi-elastic scattering with nuclear spectral functions, and meson productions with a $Δ$-hole model. Nuclear modifications of the parton distribution functions determined through a global analysis are also discussed. Finally, we discuss issues to be addressed for future developments.

Submitted 11 April, 2017; v1 submitted 4 October, 2016; originally announced October 2016.

Comments: 68 pages, 33 figures; (v2) reference added, author name in metadata corrected; (v3) Discussion extended in Sec. I and IV, Fig. 10 added to compare with a previous result on two-pion productions, Table I extended to include a comparison with a recent T2K result on coherent pion production

Report number: KEK-TH-1905, J-PARC-TH-0052

Journal ref: Reports on Progress in Physics 80, 056301 (2017)

arXiv:1604.03237 [pdf, ps, other]

doi 10.7566/JPSCP.12.010049

Neutrino Induced 4He Break-up Reaction -- Application of the Maximum Entropy Method in Calculating Nuclear Strength Function

Authors: T. Murata, W. Horiuchi, T. Sato, S. X. Nakamura

Abstract: The maximum entropy method is examined as a new tool for solving the ill-posed inversion problem involved in the Lorentz integral transformation (LIT) method. As an example, we apply the method to the spin-dipole strength function of 4He. We show that the method can be successfully used for inversion of LIT, provided the LIT function is available with sufficient accuracy.

Submitted 11 April, 2016; originally announced April 2016.

Comments: 5 pages, 2 figures. Poster presented by TM at the International Workshop on Neutrino-Nucleus Interaction in the Few-GeV Region (NuInt15), November 16-21 2015, Osaka, Japan

arXiv:1603.02412 [pdf, ps, other]

Stochastic dual averaging methods using variance reduction techniques for regularized empirical risk minimization problems

Authors: Tomoya Murata, Taiji Suzuki

Abstract: We consider a composite convex minimization problem associated with regularized empirical risk minimization, which often arises in machine learning. We propose two new stochastic gradient methods that are based on stochastic dual averaging method with variance reduction. Our methods generate a sparser solution than the existing methods because we do not need to take the average of the history of the solutions. This is favorable in terms of both interpretability and generalization. Moreover, our methods have theoretical support for both a strongly and a non-strongly convex regularizer and achieve the best known convergence rates among existing nonaccelerated stochastic gradient methods.

Submitted 8 March, 2016; originally announced March 2016.

Comments: 30 pages, 12 figures

arXiv:1501.05741 [pdf, ps, other]

Extraction of Neutrino Flux from the Inclusive Muon Cross Section

Authors: Tomoya Murata, Toru Sato

Abstract: We have studied a method to extract the neutrino flux from neutrino-nucleus reaction data using the maximum entropy method. We demonstrate a promising example of extracting the neutrino flux from the inclusive cross section of muon production without selecting a particular reaction process such as quasi-elastic nucleon knockout.

Submitted 23 January, 2015; originally announced January 2015.

Comments: 7 pages, 4 figures, 1 table. Talk presented by TM at the International Workshop on Neutrino Factories and Future Neutrino Facilities (NUFACT 2014), August 25-30 2014, Glasgow, UK

arXiv:1407.4990 [pdf, other]

doi 10.1103/PhysRevE.90.012806

Detecting network communities beyond assortativity-related attributes

Authors: Xin Liu, Tsuyoshi Murata, Ken Wakita

Abstract: In network science, assortativity refers to the tendency of links to exist between nodes with similar attributes. In social networks, for example, links tend to exist between individuals of similar age, nationality, location, race, income, educational level, religious belief, and language. Thus, various attributes jointly affect the network topology. An interesting problem is to detect community structure beyond some specific assortativity-related attributes $ρ$, i.e., to take out the effect of $ρ$ on network topology and reveal the hidden community structure which is due to other attributes. An approach to this problem is to redefine the null model of the modularity measure, so as to simulate the effect of $ρ$ on network topology. However, a challenge is that we do not know to what extent the network topology is affected by $ρ$ and by other attributes. In this paper, we propose Dist-Modularity which allows us to freely choose any suitable function to simulate the effect of $ρ$. Such freedom can help us probe the effect of $ρ$ and detect the hidden communities which are due to other attributes. We test the effectiveness of Dist-Modularity on synthetic benchmarks and two real-world networks.

Submitted 14 July, 2014; originally announced July 2014.

Comments: 10 pages, 8 figures

Journal ref: Physical Review E 90, 012806 (2014)

Showing 1–50 of 60 results for author: Murata, T